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We analytically study the dynamics of evolving populations that exhibit metastability 
on the level of phenotype or fitness. In constant selective environments, such metastable 
behavior is caused by two qualitatively different mechanisms. One the one hand, populations 
may become pinned at a local fitness optimum, being separated from higher-fitness genotypes 
by a fitness barrier of low-fitness genotypes. On the other hand, the population may only be 
metastable on the level of phenotype or fitness while, at the same time, diffusing over neutral 
networks of selectively neutral genotypes. Metastability occurs in this case because the 
population is separated from higher-fitness genotypes by an entropy barrier. The population 
must explore large portions of these neutral networks before it discovers a rare connection 
to fitter phenotypes. 

We derive analytical expressions for the barrier crossing times in both the fitness barrier 
and entropy barrier regime. In contrast with "landscape" evolutionary models, we show that 
the waiting times to reach higher fitness depend strongly on the width of a fitness barrier and 
much less on its height. The analysis further shows that crossing entropy barriers is faster 
by orders of magnitude than fitness barrier crossing. Thus, when populations are trapped in 
a metastable phenotypic state, they are most likely to escape by crossing an entropy barrier, 
along a neutral path in genotype space. If no such escape route along a neutral path exists, 
a population is most likely to cross a fitness barrier where the barrier is narrowest, rather 
than where the barrier is shallowest. 
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I. INTRODUCTION 



For populations evolving under selection, muta- 
tion, and a static fitness function, there are two 
main mechanisms thought to be responsible for the 
occurrence of dynamical metastability — a behavior 
commonly observed in natural and artificial evolu- 
tionary processes |T| f^j8||ic|]2"3f| and called punctuated 
equilibria in paleobiology E3. First, a population 
may become trapped around a local optimum in the 
fitness "landscape" until a rare mutant crosses a fit- 
ness barrier to a higher nearby peak. Second, more 
recently it has been proposed ]I(],|l5|,^9| that popu- 
lations may evolve neutrally, drifting randomly over 
neutral networks of isofitness genotypes in genotype 
space, until a rare single-point mutant connection is 
found to another neutral network of higher fitness. 
In this case, the population must cross an entropy 
barrier by visiting a large volume of the neutral net- 
work before it discovers a path to higher fitness. 



To understand the relative roles of these two 
mechanisms in evolutionary metastability, in the fol- 
lowing we study the dynamics of a population evolv- 
ing under simple fitness functions that contain a 
single fitness barrier of tunable height and width. 
In order for the population to escape its current 
metastable state and so reach higher fitness, it must 
create a genotype that is separated from the cur- 
rent fittest genotypes in the population by a valley 
of lower-fitness genotypes. The height of the fitness 
barrier measures the relative selective difference be- 
tween the current fittest genotypes and the lower- 
fitness genotypes in the intervening valley. Its width 
denotes the number of point mutations the current 
fittest genotypes must undergo to cross the valley of 
low fitness genotypes. We derive explicit analytical 
predictions for the barrier crossing times as a func- 
tion of population size, mutation rate, and barrier 
height and width. The scaling of the fitness-barrier 
crossing time as a function of these parameters shows 
that the waiting time to reach higher fitness depends 
crucially on the width of the barrier and much less 
on the barrier height. 

This contrasts with the scaling of the barrier cross- 
ing time for a particle diffusing in a double-well 
potential — a model proposed previously for popu- 
lations crossing a fitness barrier |2(],|2^]. For such 
stochastic processes, it is well known that the wait- 
ing time scales exponentially with the barrier height 
p2[ . In the population dynamics that we analyze 
here, we find that the waiting time scales approxi- 
mately exponential with barrier width and only as 
a power law of the logarithm of barrier height. In 
addition, the waiting time scales roughly as a power 
law in both population size and mutation rate. 

When the barrier height is lowered below a critical 
height, the fitness barrier turns into an entropy bar- 
rier. We show that, in general, neutral evolution via 
crossing entropy barriers is faster by orders of mag- 
nitude than fitness barrier crossing. Additionally, 
we show that the waiting time for crossing entropy 
barriers exhibits anomalous scaling with population 
size and mutation rate. 

Finally, we extend our analysis to a class of more 
complicated fitness functions that contain a network 
of tunable fitness and entropy barriers. We show 
that the theory still accurately predicts fitness- and 
entropy-barrier crossing times in these more compli- 
cated cases. 

The general conclusion drawn from our analysis is 
that, when populations are trapped in a metastable 
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phenotypic state, they are most likely to escape this 
metastability by crossing an entropy barrier. That 
is, the escape to a new phenotype occurs along a 
neutral path in genotype space. If no such neutral 
path exists, then the population is most likely to 
cross a fitness barrier at the place where the barrier 
is narrowest. 



A. Evolutionary Pathways and Metastability 



The notion of an adaptive landscape, first intro- 
duced by Wright has had a large impact on 
our appreciation of the mechanisms that control how 
populations evolve in static environments. The in- 
tuitive idea is that a population moves up the slopes 
of its fitness "landscape" just as a physical system 
moves down the slope of its potential-energy surface. 
Once this analogy has been accepted, it is natural 
to borrow many of the qualitative results on the dy- 
namics of physical systems to account for the dy- 
namics of evolving populations. For instance, it has 
become common to assume that an evolving popu- 
lation can be modeled by a single uphill walker in a 
"rugged" fitness landscape p6| , pl[ . 

There are, however, seemingly different kinds of 
evolutionary behavior than incremental adaptation 
via "landscape" crawling. For example, metastabil- 
ity or punctuated equilibrium of phenotypic traits 
in an evolving population appears to be a common 
occurrence in biological evolution |||l3| as well as in 
models of natural and artificial evolution [p]J^JlO[| . As 
just pointed out, for simple cases where populations 
evolve in a relatively constant environment, there 
are two main mechanisms that have been proposed 
to account for this type of metastable behavior. 

The first and most commonly accepted explana- 
tion was already implicit in Wright's shifting balance 
theory |Q. A population moves up the slope of its 
fitness "landscape" until it reaches a local optimum, 
where it stabilizes. The population is pictured as 
a cloud in genotype space focused around this lo- 
cal optimum. The population remains in this state 
until a rare sequence of mutants crosses a valley of 
low fitness towards a higher fitness peak. In this 
view, metastability is the result of fitness barriers 
that separate local optima in genotype space. 

This mechanism for metastability is very reminis- 
cent of that found in physical systems. Metastability 
occurs there because local energy minima in state 



space are separated by potential energy barriers, 
which impede the immediate transition between the 
minima. A physical system generally moves through 
its state space along trajectories that lower its en- 
ergy. Once it reaches a local minimum it tends to 
stay there. However, when such a system is sub- 
ject to thermal fluctuations, through a sequence of 
chance events it can eventually be pushed over a bar- 
rier that separates the current local minimum from 
another. When this transition occurs, it turns out 
that the system moves quickly to the new local min- 
imum. 

Mathematically, barrier crossing processes in 
physical systems are most often described as diffu- 
sion in a potential field, where the potential repre- 
sents the energy "landscape" . These processes have 
been extensively studied and the basic quantitative 
results are widely known [ pdp^p^ ]. For example, 
barrier crossing times increase exponentially with 
the height of the barrier and inverse exponentially 
with the fluctuation amplitude, as measured by tem- 
perature. 

In light of the physical metaphor for evolving pop- 
ulations, it is not surprising that the dynamics of 
populations crossing fitness barriers has been mod- 
eled using a class of diffusion equations analogous 
to those used to describe thermally driven systems 
in a potential |2^,|2^]. In this approach, the dy- 
namics of the average fitness of the population is 
modeled as diffusion over the "fitness landscape", 
thermal fluctuations are replaced by random genetic 
mutations and drift, and the population size, which 
controls sampling stochasticity, plays the role of in- 
verse temperature. As a direct consequence, it was 
found that fitness-barrier crossing times scale expo- 
nentially with population size in these models. Note 
that it is assumed in this approach that the popula- 
tion as a whole must cross the fitness barrier. 

In the following, we show that the analogy with 
the physical situation and, in particular, the trans- 
lation of results from there are misleading for the 
understanding of the evolutionary dynamics. For 
example, a direct analysis of the population dynam- 
ics reveals that for most parameter ranges, the time 
to cross a fitness barrier scales very differently for 
populations evolving under selection and mutation. 
For example, the waiting time is determined by how 
long it takes to generate a rare sequence of mutants 
that crosses the fitness barrier, as opposed to how 
long it takes the population as a whole to cross the 
fitness barrier. 
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This brings us to the second main mechanism for 
metastability — one that has been put forward more 
recently [^10 T^j23|,^{|. The second mechanism de- 
rives from the observation that large sets of mu- 
tually fitness-neutral genotypes are interconnected 
along single-point mutation paths. That is, sets of 
isofitncss genotypes form extended neutral networks 
under single-point mutations in genotype space. 

In this alternative scenario, a population displays 
a constant distribution of phenotypes for some pe- 
riod while, at the same time, individuals in the 
population diffuse over a neutral network in geno- 
type space. That is, despite phenotypic metastabil- 
ity, there is no genotypic stasis during this period. 
The phenotype distribution remains metastable un- 
til, via diffusion over the neutral network, a member 
of the population discovers a genotypic connection 
to a higher-fitness neutral network. 

When this mechanism operates, metastability is 
the result of an entropy barrier, as we call it. The 
population must spread over or search large parts of 
the neutral network before it finds a connection to 
a higher-fitness network. One envisages the popula- 
tion moving randomly through a genotypic labyrinth 
of common phenotypes with only a single or rela- 
tively few exits to fitter phenotypes. 



B. Overview 



In the following, we analyze and compare the pop- 
ulation dynamics of crossing such fitness and entropy 
barriers with the goals of elucidating the basic mech- 
anisms responsible for each, calculating the scaling 
forms for the evolutionary times scales associated 
with each, and understanding their relative impor- 
tance when both can operate simultaneously. 

Section || defines the basic evolutionary model. 

Section ^ introduces a tunable fitness function 
that models the simplest case in which to study 
both types of barrier crossing. It consists of a sin- 
gle local optimum, with a valley, and a single por- 
tal (target genotype) in genotype space. By tuning 
the height of the local optimum one can change the 
fitness barrier into an entropy barrier. We analyze 
this basic model as a branching process, calculat- 
ing the statistics of lineages of individuals in the 
fitness valley. Comparison of the theoretical pre- 
dictions for the fitness-barrier crossing times with 
data obtained from simulations shows that the the- 



ory accurately predicts these fitness-barrier crossing 
times for a wide range of parameters. We also de- 
rive several simple scaling relations for the fitness- 
barrier crossing times appropriate to different pa- 
rameter regimes. 



Section IV first determines the barrier heights at 
which the fitness-barrier regime shifts over into an 
entropic one. After this, we discuss the popula- 
tion dynamics of crossing entropy barriers, provid- 
ing rough scaling relations for the barrier crossing 
times in this regime. Comparison of these results 
with the scaling relations for fitness-barrier cross- 
ing shows that entropy-barrier crossing proceeds 
markedly faster than crossing fitness barriers. 

Section |v| extends our analysis to a set of much 
more complicated fitness functions — a class called 
the Royal Staircase with Ditches. These fitness func- 
tions are closely related to the Royal Road [ p9|3C| ] 
and Royal Staircase J2?],|2^] fitness functions that we 
studied earlier, which consist of a sequence or a net- 
work of entropy barriers only. The Royal Staircase 
with Ditches generalizes this class of fitness func- 
tions to one that possesses multiple fitness and en- 
tropy barriers of variable width, height, and volume. 
We adapt the theoretical analysis using our statisti- 
cal dynamics approach 1 30 1 to deal with these more 
complicated, but more realistic cases. Comparison 
of the theoretically predicted and experimentally ob- 
tained barrier crossing times again shows that the 
theory accurately predicts the barrier crossing times 
in these more complicated situations as well. 

Finally, Sec. [vj presents our conclusions and dis- 
cusses the general picture of metastable population 
dynamics that emerges from our analyses. 



II. EVOLUTIONARY DYNAMICS 



We consider a simple evolutionary dynamics of 
selection and mutation with a constant population 
size M. An individual's genotype consists of a bi- 
nary sequence of on-or-off genes. We consider the 
simple case in which the fitness of an individual is 
determined by its genotype only. The genotype-to- 
phenotype and phenotype-to-fitness maps are col- 
lapsed into a direct determination of a genotype's 
fitness. Selection and reproduction are assumed to 
take place in discrete generations, with mutation oc- 
curring at reproduction. The exact evolutionary dy- 
namics is defined as follows. 
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• A population consists of M binary sequences 
of a fixed size L. 

• A fitness f s is associated with each of the 2 L 
possible genotypes s £ A L , where A = {0, 1}. 

• Every generation M individuals in the current 
population are sampled with replacement and 
with a probability proportional to their fitness. 
Thus, the expected number of offspring for an 
individual with genotype s is f s /(f), where (/} 
is the current average fitness of the population. 

• Once the M individuals have been selected, 
each bit in each individual is mutated (flipped) 
with probability fj,, the mutation rate. 

In this basic model there are effectively two evolu- 
tionary parameters: the mutation rate \x and the 
population size M. 

Several aspects of the basic model — such as, dis- 
crete generations and fixed population size — were 
mainly chosen for analytical convenience. The 
discrete-generation assumption can be lifted, lead- 
ing to a continuous-time model, without affecting 
the results presented below. As for the assumption 
of fixed population size, the analysis can be adapted 
in a straightforward manner to address (say) fluctu- 
ating or exponentially growing populations. 

Models including genetic recombination are no- 
toriously more difficult to analyze mathematically. 
Despite our interest in the effects of recombina- 
tion, it is not included here largely for this reason. 
Moreover, for wide parameter ranges in the neutral 
and piecewise-neutral evolutionary processes we con- 
sider, it appears that recombination need not be a 
dominant mechanism. For example, Refs. [30| , sec. 
6.5] and sec. VIII] show that recombination of- 
ten does not significantly affect population dynamics 
in these cases. 



III. CROSSING A SINGLE BARRIER 



We first consider the simple case of a single bar- 
rier for the population to cross. Of the 2 L geno- 
types, there is one with fitness a > 1 that we refer 
to as the peak genotype II. Then there are 2 L — 2 
genotypes with fitness 1 that we refer to as valley 
genotypes. Finally, there is a single portal genotype 
fl at a Hamming distance w from the fitness-cr peak 
genotype II. We view the portal genotype as giving 



access to higher-fitness genotypes — genotypes whose 
details are unimportant, since in this section we only 
analyze the dynamics up to the portal's first discov- 
ery. 

The variable w tunes the fitness barrier's width 
and the variable a its height. The height a also in- 
dicates a peak individual's selective advantage over 
those in the valley, as measured by the relative dif- 
ference o~ — 1 of their expected number of offspring. 
Figure [l| illustrates the basic setup. 



Peak n 
f = o 



Portal 




FIG. 1. Evolution from the peak genotype IT to the 
higher-fitness portal genotype SI via low-fitness valley 
genotypes. The selective advantage of the peak indi- 
viduals over those in the valley is controlled by the peak 
height a. The portal and peak genotypes are a Ham- 
ming (mutational) distance w apart. The domain is the 
hypercube A L of all length-!/ genotypes. 



At time t — the population starts with all M 
genotypes located at the peak H We then evolve 
the population under selection and mutation, as de- 
scribed in the previous section, until the portal geno- 
type fl occurs in the population for the first time. 
(Hence, the portal's fitness is not relevant.) This de- 
fines one evolutionary run. We record the time t at 
which the portal is discovered. We are interested in 
the average discovery time (t) , averaged over an en- 
semble of such runs. We are particularly interested 
in the scaling of this barrier crossing time (t) as a 
function of the evolutionary parameters M and /z, 
as well as the barrier parameters a and w. 

Let's briefly review in simple language the evolu- 
tionary dynamics before launching into the math- 
ematical analysis. In the parameter regime with 
a>l, where the peak fitness is considerably larger 
than the valley fitness, and with the mutation rate /i 
not too high, the bulk of the population remains at 
the peak. That is, the population is a quasispecies 
cloud, centered around the peak genotype n M. For 
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such parameter regimes, the barrier is clearly a fit- 
ness barrier: the waiting time (t) is determined by 
the time it takes to create a rare sequence of mutant 
genotypes that crosses the valley between the peak 
and the portal. 

However, as a — > 1 + , the fitness barrier trans- 
forms into an entropy barrier. For a = 1 there is no 
fitness difference between peak and valley genotypes 
and the entire population simply diffuses through 
genotype space until the portal is discovered. As we 
will see below, the entropic regime sets in rather sud- 
denly at a value of a c somewhat above £7 = 1. As we 
show, this transition is the well known error thresh- 
old of molecular evolution theory At a = <r c , 
the value of which depends on the population size 
M and mutation rate \x, the subpopulation on the 
peak becomes unstable in the sense that all individ- 
uals on the peak may be lost through a fluctuation. 
More precisely, the waiting time for such a fluctu- 
ation to occur becomes short in comparison to the 
fitness-barrier crossing time. When this fluctuation 
has occurred, there is no longer a restoring "force" 
that keeps the population concentrated around the 
peak genotype. The population as a whole diffuses 
randomly through the valley as if the genotypes were 
all fitness neutral. While our analysis accurately pre- 
dicts the barrier crossing times in the fitness-barrier 
regime, it is notable that beyond the error threshold, 
in the entropic regime, only order-of-magnitude pre- 
dictions can be obtained using the current analytical 
tools. 

Calculating the barrier crossing time proceeds in 
three stages. First, in Sec. Ill A we determine the 
population's quasispecies distribution, defined as the 
average proportions of individuals located on the 
peak and in the valley during the metastable state. 
From this, one directly calculates t he av erage fitness 
in the population. Second, in Sec. Ill B wc consider 



the genealogy statistics of individuals in the valley. 
In the fitness-barrier regime, genealogies in the val- 
ley are generally short-lived and are all seeded by 
mutants of the peak genotype. We approximate the 
evolution of valley genealogies as a branching process 
and use this representation to calculate the average 
barrier cros sing time. Third, with this analysis com- 
plete, Sec. V_E then addresses the transition from 



the fitness-barrier regime to the entropic one. 



A. Metastable Quasispecies 



Each evolutionary run, the population starts out 
concentrated at the peak genotype EL After a relax- 
ation phase, assumed to be short compared to the 
barrier crossing time, there will be roughly constant 
proportions of the population on the peak and in the 
valley. We now calculate the equilibrium proportion 
Pn of peak individuals and the population's average 
fitness (/), after this relaxation phase. 

To first approximation, one can neglect back mu- 
tations from valley individuals back into the peak 
genotype. First of all, if a ^S> 1, selection keeps the 
bulk of the population on the peak. Additionally, 
valley individuals produce fewer offspring than peak 
individuals and they are unlikely — with a probabil- 
ity 1 / L at most — to move back onto the peak when 
they mutate. In this regime, the quasispecies dis- 
tribution is largely the result of a balance between 
selection expanding the peak population by a factor 
of a /(f) and deleterious mutations moving them into 
the valley with probability 1 — (1 — /i) L . The result is 
that we have a balance equation for the proportion 
Pn of peak individuals given by 



Pn 



(1 - M ) L Pn . 



From this we immediately have that 
(f) = o-{l-ri L ■ 



(1) 



(2) 



Since we also have that (/) = ctPq + 1 • (1 — Pn), we 
can determine the proportion of peak individuals to 
be: 



Pn - 



(3) 



For parameters where (/) = cr(l — n) L 3> 1, Eqs. 
(||) and (||) give quite accurate predictions for the 
average fitness and the proportion of individuals on 
the peak. 

In cases where (/) is close to 1, a substantial pro- 
portion of the population is located in the valley 
and back mutations from the valley onto the peak 
must be taken into account. To do this, we intro- 
duce the quasispecies Hamming distance distribu- 
tion P = (Po, . . . , Pi, . . . , Pl), where Pi is the pro- 
portion of individuals located at Hamming distance 
i from n. Thus, Pq = Pn indicates the proportion 



G 



on the peak. Under selection, the distribution P 
changes according to: 



DS ei_ (<r -l)&m + l n 
1 ~ (/) 1 



(4) 



where Sij = 1, if i = j, and Sij = 0, otherwise. We 
can write this formally as the result of an operator 
acting on P: 



ps 



S • P 



where 



Sij = [{a - l)8 i0 + 1] 6. 



(5) 



(6) 



defines the selection operator S. 

Next, we consider the transition probabilities My 
that under mutation a genotype at Hamming dis- 
tance j from the peak moves to a genotype at Ham- 
ming distance i from the peak. We have that: 



M 



L-j j 

tt=0 d=0 

xfj, u+d (l- (i) L - u - d 



u ) \d 



(7) 



That is, Mij is the sum of the probabilities of all 
possible ways to mutate u of the L — j bits shared 
with n and d of the j bits that differ, such that 
j + u — d = i. Equation (^) defines the mutation 
operator M. 

We can now introduce the generation operator 
G = M ■ S. The equilibrium quasispecies distri- 
bution P is a solution of the equation 



P 



G-P 



(8) 



In this way, the quasispecies distribution is given 
by the principal eigenvector, normalized in proba- 
bility, of the matrix G; while the average fitness (/) 
is given by G's principal eigenvalue. Note that this is 
conventional quasispecies theory 0, apart from the 
facts that we have grouped the quasispecies mem- 
bers into Hamming-distance classes and that we con- 
sider discrete generations, rather than continuous 
time. 



B. Valley Lineages 



Under the approximation that back mutations 
from the valley onto the peak can be neglected, a 
roughly constant proportion I — Pu of valley indi- 
viduals is maintained by a roughly constant influx 
of mutants from the peak. Every generation, some 
peak individuals leave mutant offspring in the val- 
ley. Additionally, each valley individual leaves on 
average a fraction I / (/) offspring in the next gener- 
ation, as can be seen from Eq. (0). This means that 
the fraction of valley individuals, for which all of its 
t ancestors in the previous t generations were val- 
ley individuals as well, is only For (/} 3> 1, 
this implies in turn that whenever a peak individual 
seeds a new lineage of valley individuals by leaving 
a mutant offspring in the valley, this lineage is un- 
likely to persist for a large number of generations. In 
other words, lineages composed of valley individuals 
are short lived. 

Intuitively, the idea is that the preferred selection 
of peak individuals leads to a "surplus" of peak off- 
spring that spills into the valley through mutations. 
Each mutant offspring of a peak individual forms the 
root of a relatively small, i.e., short-lived, genealog- 
ical tree of valley individuals. The barrier crossing 
time is determined essentially by the waiting time 
until one of these genealogical bushes produces a de- 
scendant that discovers the portal. These processes 
arc illustrated in Fig. 0. 




Time -> 

FIG. 2. Genealogies during fitness-barrier crossing. 
An example genealogical tree is sketched for peak in- 
dividuals (above); they have fitness a and are copies of 
the peak genotype II. The valley individuals (below) at 
lower fitness occur in genealogies that are seeded (dashed 
lines) from peak individuals. These genealogies are rel- 
atively short-lived bushes. Evolution continues until the 
time at which one of the valley bushes discovers the por- 
tal genotype Q. 
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We will analyze the evolution of a valley lineage 
as a branching process . The probability O n that 
one particular valley individual leaves n offspring in 
the next generation is given by a binomial distribu- 
tion. This is well approximated by a Poisson distri- 
bution as follows: 

To a good approximation, we may treat the evo- 
lution of each valley lineage as independent of the 
other valley lineages. Under this approximation, 
each valley individual independently has a distribu- 
tion of offspring given by Eq. (^). Of course, under 
fixed population size, the independence assumption 
may break down when valley lineages dominate the 
population. 

We now calculate the probability that a valley lin- 
eage produces a descendant that discovers the portal 
f2 before the lineage goes extinct. Let a valley lin- 
eage be founded by an ancestor in the valley that is 
located at Hamming distance j from f2. We denote 
by pj (t) the probability that t generations from now, 
none of this ancestor's descendants will have discov- 
ered the portal. This probability can be determined 
recursively, in terms of the probabilities piit — 1) as 
follows, 

L 

p j {t)=0 Q + 1 ^p i {t-l)M ii (10) 

i=l 

L 

+ 2 2 " l)M«Pfc(* - !) M ^ + • • ■ ■ 

i,k=l 

The first term in the above equation corresponds 
to the ancestor having no offspring. This, of course, 
ensures that the portal will not be discovered t gen- 
erations from now, since leaving zero offspring im- 
plies that the genealogy goes extinct immediately. 
The second term corresponds to the ancestor having 
one offspring, at Hamming distance i from the por- 
tal, that will not give rise to discovery of the portal. 
That is, since this offspring itself forms the ancestor 
of a new valley lineage, pi(t — 1) gives the probabil- 
ity that none of its descendants discovers the portal 
within the next t— 1 generations. The third term cor- 
responds to the ancestor having two offspring, one 
at distance i from the portal and one at distance k, 



neither of which give rise to the discovery of Cl. The 
higher-order terms in Eq. ( |Io| ) correspond to the 
ancestor having 3, 4, and more offspring. 

Recall that the mutation operator My , as defined 
in Eq. ([?]), gave the probability to go from Ham- 
ming distance j to distance i from the peak under 
mutation. M t j appears above in Eq. ( |l0| ) with a 
different, but equivalent, meaning: My there gives 
the probability to go from a Hamming distance j to 
a distance i from the portal under mutation. This 
use of My appears repeatedly in the following. 

Using Eq. (^|) we can sum the series in Eq. (JTc|) , 
obtaining: 

Pj{ t) = e ([p(*-i)-M], -!)/</> ^ (n) 

where pit) — ipi(t),p2(t), ■ ■ ■ ,Ph{t)) and the vector 
notation denotes the sum 

L 

[p(i-l)-M] j= 5>(t-l)M y - . (12) 
i=i 

For (/) > 1, a valley genealogy eventually either 
discovers il or goes extinct; see, for instance, Ref. 
full . Letting t — > oo in Eq. ([Tl|), we obtain a set of 
nonlinear equations for the asymptotic probabilities 
pj that a genealogical bush, whose founder started 
at Hamming distance j from Q, goes extinct before 
any of its descendants discovers the portal. These 
are given by 

Pj = e ([P-ML-i)/</> . (13) 

Equations (|l|) appear to be unsolvable in closed 
analytical form. Their solutions may be numerically 
approximated in a straightforward manner; for in- 
stance, by simply iterating Eq. (|TT|). However, in 
the regime where fi is small and (/) is not too close 
to 1, the probabilities pi are generally very close to 
1. In this regime, one may expand Eq. ( |l3| ) to first 
order around pi = 1. To do this, we introduce the 
probabilities Cj = 1 — pi that the portal does get dis- 
covered by the lineage before it goes extinct. To first 
order in we obtain from Eq. ( |l3| ) the equations 
given by 

e 3 = 1 - e- M °,/</> (l - , (14 ) 

where e = (ei, . . . , ej,). These equations can be eas- 
ily inverted, yielding: 
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i=l 



-M 0i /</> 



(i-R)- 



(15) 



where I is the identity matrix and the matrix R has 
components 



R, 



M 



!i e -M 03 /</> 



(16) 



Note that the indices i and j in the matrices run from 
1 to L, corresponding to ancestors at Hamming dis- 
tances between 1 and L from the portal. Note also 
that, by definition, e = 1. 

To first order, Eqs. (JlJ) give the probabilities Cj 
that a valley lineage, founded by an ancestor at a 
Hamming distance j from f2, discovers the portal 
before the lineage goes extinct. Now to calculate 
the barrier crossing time we just have to determine 
the number of new valley lineages that are founded 
per generation. 



C. Crossing the Fitness Barrier 



this always corresponds to a new lineage in the val- 
ley. For j — w, we must discount for the probability 
that the peak individual did not undergo any muta- 
tions at all. This is given by the term (1 — /j,) L 5j W . 

Putting these together, we find the probability 
P scl , that a selection does not seed a lineage leading 
to the portal, is given by 



pscl pnot 



seed 



(19) 



Using Eqs. @, @, and @ and the identity 
(/} = 1 + (cr — l)Pn, Eq. ( |l9| ) can be rewritten as 



P 



>scl 



1 



<7((/)-l) 

(*-!)</) 

x [(l- Ai ) L e^ + (/)log(l-e u ,)] . (20) 



Expanding the logarithm to first order in e w , and us- 
ing the approximation in Eq. (|^) for (/), we obtain 
the simple expression 



1 



>«/>-!)■ 



(21) 



Every generation, M individuals are selected in 
proportion to their fitness. Each such selection may 
lead to the seeding of a new lineage in the valley. 
The probability p not that a selection will not lead 
to the founding of a new valley lineage is given by: 



pnot 



1-Pa 

(/) 



(17) 



The first term corresponds to selecting a valley in- 
dividual, that by definition is already on a lineage. 
The second term corresponds to a peak individual 
being selected and reproducing without mutation, 
leaving an offspring on the peak. 

The probability P<> ccd that a new lineage will be 
seeded in the valley and at a Hamming distance j 
from f2 is given by 

Pr d = ^Pn[M JW -(l-^) L S, w ] , (18) 



where w is the Hamming distance between the peak 
and the portal. The first factor, crPn/(f), gives the 
probability that a peak individual will be selected. 
The term Mj W is the probability that under muta- 
tion this peak individual moves from Hamming dis- 
tance w to distance j from the portal. For j ^ w 



The probability that none of the M selections 
from the current generation seeds a lineage that dis- 
covers the portal is simply (P scl ) . By our as- 
sumption of a roughly constant quasispecies distri- 
bution, this probability is constant. Thus, the ex- 
pected number (t) of generations until a lineage will 
be seeded that discovers the portal is given by 



(*> = 



1 



1 



(p S el) M M((/}-lK 



1 - (P s 



where e w is given by Eqs. (151) and (p_ 



(22) 



Equation (|22[) constitutes our theoretical predic- 
tion for the average barrier crossing time (t) as a 
function of the population size M, the fitness differ- 
ential a between the peak and the valley, the muta- 
tion rate /i, the string length L, and the width w of 
the fitness barrier. To obtain it, we made several ap- 
proximations. We assumed that a was large enough 
and n small enough such that (/) was substantially 
larger than 1. Under those assumptions, lineages in 
the valley are short lived, the total number of indi- 
viduals in the valley will be small with respect to M, 
and the probabilities ej will be small. This justifies 
our leading-order expansion for (t). 
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D. Additional Time in Valley Bushes 



Equation (22j) gives the average number (t) of gen- 
erations until a lineage is founded that discovers the 
portal. The actual average time until the portal is 
discovered is somewhat longer, since the lineage that 
finds the portal itself takes a certain average number 
of generations to discover the portal. Specifically, 
there is an additional average time, that we denote 
by {dt} , between the founding of the first lineage that 
discovers the portal and the actual discovery of O. 

We can directly approximate this correction term 
{dt) when the Ci are small. As we will see below, 
generally {dt} <C {t) in the parameter regime where 
our approximations arc valid. This makes the effect 
of including the correction term {dt) rather small in 
these parameter regimes. However, as we approach 
the parameter regime where the ti become large, the 
average number of generations {t} until the found- 
ing of the lineage that discovers the portal becomes 
comparable to the average number of generations 
{dt) that it takes this lineage to actually discover 
the portal. In this (limited) parameter regime, in- 
cluding the correction term {dt) leads to a significant 
improvement of our theoretical predictions. 

Paralleling the development of the Eq. (|l3|), we 
start by expanding Eq. ( pd| ) to first order in ej(i); 
the probability that the lineage starting at distance 
j has discovered the portal by time t. We find that 



-M 0j /</> 



\e(t- 1) - Ml .' 



The expected additional time (dtj), given that the 
lineage started at a Hamming distance j from fl and 
conditioned on the lineage discovering the portal, is 
formally given by 

= E^^, (24) 



t=o 



where the asymptotic tj is given by Eqs. @ and 
(O). Using Eq. ( p3| ) and the boundary conditions 
ejjb) = 0, the above sum gives: 



(d tj ) = ^(l-e-W)) ( l-R) 



(25) 



where the matrix R is again defined by Eq. ©. 

In order to obtain {dt) we have to weigh each of 
the times {dtj) with a factor Cj corresponding to the 
relative proportion of times that a lineage starting 
at Hamming distance j discovers the portal. That 
is, averaged over an ensemble of runs, Cj is the pro- 
portion of times that the portal was discovered by 
a lineage that started at Hamming distance j. The 
weight Cj should be proportional to both the proba- 
bility ej and the rate of creating of lineages at Ham- 
ming distance j from the portal. We have that 



ej (M jw - (1 - y) L 5 ]W ) 

J2k e k ( M fc» - (i - 

3=0,1,..., 



(26) 



where the factors in parentheses are similar to that 
found in Eq. (18). It should be noted that here 



the indices run from to I and not from 1 to L, 
since the portal may also be discovered by a jump 
mutation directly from the peak. 

Combining Eqs. ( ^H| ) and (|2^ ) and using Eq. (|l5|), 
we find that the average length of the valley bush 
that discovers the portal is 



{dt) = 



[ e .(I-R).(M-I(l-^)] t 
[e • (M — 1(1 - tf)] w + M 0w 



(27) 



where, again, e is given by its components in Eqs. 
( |l~5| ) and (16). The indices in the vector notation 
now run from 1 to L. 

Adding the correction term {dt) to {t) as given 
by Eq. (|22j) improves our theoretical predictions es- 
pecially in the regime where the become large. 
However, we still expect the approximations leading 
to the above equations for (t) and {dt) to break down 
when (/) — » 1 + . 



E. Theory versus Simulation 



We simulated an evolving population using a fit- 
ness function consisting of a single barrier, as de- 
scribed in Sees. || and III , for a wide range of param- 
eter settings to quantitatively test our theoretical 
predictions. Results for several parameter regimes 
are shown in Fig. |^, where the simulation results 
are plotted using dashed lines and the theoretical 
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predictions are plotted with solid lines. Each data 
point on the dashed lines was obtained by averaging 
over 250 runs with equal parameter settings. The 
theoretical predictions are shown as pairs of solid 
lines, where the lower solid line in each pair shows 
the predictions from Eq. (|||) and the upper solid 
line shows Eq. (^2|) plus the correction term of Eq. 
(p7|). Note that for most parameter ranges the dif- 
ference between the two solid lines is so small as to 
be undetectable. 

Figures ||(a) and |^(b) show the average barrier 



crossing time (t) as a function of the logarithm 
log(er) of the barrier height. Additionally, both (t) 
and log(cr) are plotted using a logarithmic scale. The 
shapes of the curves correspond to the dependen- 
cies of log(i) on log(logfi). Portions of curves that 
are straight lines thus indicate a scaling of the form 
(t) oc (log a) 3 , with s the slope of the straight por- 
tion. Note that (t) ranges over 5 orders of magni- 
tude, from 10 to 10 6 , in both Figs. |(a) and |(b). 
We see that the theory accurately predicts the sim- 
ulation results for barrier heights that are not too 
small. 
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FIG. 3. Barrier-crossing times (t) as a function of barrier height a, mutation rate and barrier width w, for a 
variety of parameter settings. The simulation results are plotted using dashed lines. Each point on each dashed line 
is an estimate of (t) averaged over 250 simulation runs. The theoretical predictions are shown as pairs of solid lines: 
The lower of each pair gives the theoretical predictions of Eq. while the higher has the additional correction 

term of Eq. (E^) added. Note that, except for the horizontal axis in Fig. (d), all axes use a logarithmic scale. General 
parameter settings are indicated at the top of each plot, while parameters specific to the different runs are indicated 
next to the their lines. 



In Fig. ||(a) the theory starts deviating from the 
experimental data around log(cr) ~ 0.06 for the up- 
per two curves and around log(cr) rs 0.15 for the 
lowest. These values of a correspond to selective 



advantages a — 1 of the peak of a little over 6 
and 16 percent, respectively. Notice that the upper 
two experimental curves are almost horizontal for 
small values of a up to log(<r) ~ 0.06, after which 
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they trend upwards becoming almost linear. As we 
will show below, it turns out that the location of 
this crossover is found at the finite-population error 
threshold that separates the entropy-barrier regime 
from the fitness-barrier regime. That is, for the pa- 
rameters M — 250, fi = 0.005, and L = 10, the crit- 
ical value a c below which the population dynamics 
acts effectively as if there were no fitness peak at 
all occurs around log(er c ) rj 0.06. The same phe- 
nomenon is observed in the two upper curves of Fig. 
||(b): the crossover occurs around log(er c ) w 0.05. 
Note that the correction terms (dt) extend the pa- 
rameter region over which the theory provides ac- 
curate predictions approximately up to the finite- 
population error threshold. 

Above the error threshold, for values of a in the 
fitness-barrier regime, the curves appear nearly lin- 
ear. This indicates that the barrier crossing times 
scale with powers of the logarithm of the barrier 
height a: (t) oc (log<r) s , where s is the line's slope. 
Thus, the barrier crossing time increases relatively 
slowly as a function of the barrier height. One fur- 
ther observes that the barrier crossing times are not 
only longer for wider barriers (larger values of w), 
but that the slopes of the curves are larger as well. 
That is, for large widths w the barrier crossing time 
increases faster as a function of a than for low values 
of w. 

Figure [|(c) shows the barrier crossing time (t) as 
a function of the mutation rate fi, for three differ- 
ent values of the barrier width w and two different 
values of the barrier height a. The population size 
is M = 250 and the genotypes length L = 7 for all 
three curves. On the logarithmic scales, the curves 
again look approximately linear, indicating that the 
barrier crossing time scales as a power law in the 
mutation rate /i: (t) oc /i s , where s is the slope. 
We again see that for wider barriers, the waiting 
times are both larger and vary more rapidly with 
fi. The theory predicts the simulation results quite 
accurately over the entire range. Only for large mu- 
tation rates (fi f» 0.01) do the theoretical predic- 
tions with and without the correction term of Eq. 
( p7j ) differ significantly. In this regime the theoret- 
ical and experimental values start to differ slightly 
as well, although the predictions are still accurate. 
It is notable in the two lower curve families, with 
barrier widths w = 2 and w = 3, that the correc- 
tion term (dt) improves the theoretical predictions 
for high mutation rates. 

Finally, Fig. |^(d) shows the barrier crossing time 



(t) as a function of the barrier width w. Only the 
barrier crossing time (t) is shown on a logarithmic 
scale, so that any linear dependence indicates an ex- 
ponential scaling: (t) oc 10 sw , where s is the slope. 
Again, the theory accurately predicts the barrier 
crossing time. The fact that the curves are not lin- 
ear and bend downwards shows that (t) grows more 
slowly than exponential with barrier width; although 
it still increases rapidly as a function of w. In fact, 
we chose large values of the mutation rate fi in these 
plots (/i = 0.01 and \i = 0.015) to ensure that the 
barrier crossing time is still in a reasonably bounded 
range up to large barrier widths. For smaller muta- 
tion rates, the barrier crossing times become so large 
as to make it impossible to perform an adequate 
number of simulation runs. For the case w = 1, 
the correction term (dt) leads to an overestimation 
of (f). Note, however, that for w = 1 there is ef- 
fectively no fitness barrier; the portal is a mutant 
neighbor of the peak genotype and so valley bushes 
are essentially nonexistent. 

In summary, Figs, ^(a)-(d) show that the theoret- 
ical predictions of Eq. (22), possibly including the 
correction term of Eq. (27), accurately predict the 
average barrier crossing times estimated over a wide 
range of parameters from simulations of an evolving 
population. The theory breaks down, as expected, 
when the barrier height a becomes small (a ~ 1) 
and this is illustrated on the left-hand sides of Figs. 
||(a) and ||(b). In this low-er regime, which sets in 
suddenly as a function of a, (t) becomes almost inde- 
pendent of a. Roughly speaking, the selection pres- 
sure is too small to keep the population concentrated 
around the peak, and the population randomly dif- 
fuses through the valley until it discovers the portal. 
In this regime, the barrier is in effect not a fitness 
barrier, but an entropy barrier. 



In Sec. IV we will analyze the location of the 



finite-population error threshold that separates the 
fitness and entropy barrier regimes and discuss the 
entropy-barrier crossing population dynamics. In 
the next subsection, though, we first discuss the scal- 
ing of the fitness-barrier crossing time (t) with the 
different parameters a, w, (j,, and M. 



F. Scaling of the Barrier Crossing Time 

In Fig. U we saw, by varying one parameter at a 
time, that the barrier crossing time scaled as a power 
law in the logarithm of the barrier height log(cr), 
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as a power law in mutation rate [i, and somewhat 
slower than exponential in the barrier width w. An- 
alytically extracting these scalings from Eq. ( p2| ) is 
quite challenging and incomplete at this time. Em- 
pirically, though, we found that the barrier crossing 
time can be fit quite accurately, in the regime where 
a is not too small (above the error threshold), to a 
scaling function with the following form 



w\M 



log(cr) 



(28) 



[log(cr; 



i-7-rSlogGu) 



where 7 and 5 are (constant) scaling exponents. For 
both the genotype lengths (L — 7 and L — 10) for 
which we have detailed data, we found that 7 f=a 0.75 
and 5 » 0.1. 

This empirical scaling law confirms that, in fact, 
the barrier crossing time (t) scales as a power law in 
both log(cr) and fx. We see, in particular, that the 
dependence on the mutation rate fi scales roughly in- 
versely with fx w and the dependence on log(cr) scales 
roughly as [log(cr)]"'~ 2 . Furthermore, we see that (t) 
scales as e cw /w\, with c a constant, when only the 
barrier width w is varied. The scaling with w is thus 
by far the most rapid and therefore dominant scaling. 
That is, widening the barrier increases the waiting 
time (t) much more than increasing the height of the 
barrier or decreasing the mutation rate. 

These empirically observed scaling behaviors can 
be elucidated using a simple analytical argument. 
To this end we employ several simplifications. First, 
we assume that the major contributions to the prob- 
ability of barrier crossing come from terms with the 
minimal number of mutations. That is, for barri- 
ers of width w, at least w mutations must occur in 
a peak individual in order to discover the portal. 
Thus, we assume that contributions from "paths" 
between peak and portal that involve more than 
w mutations are negligible. This for instance im- 
plies that we neglect the contributions from lineages 
founded at Hamming distances w through L from 
fl. Furthermore, we assume that valley lineages are 
unlikely to be founded more than 1 mutation away 
from the peak. Putting these together, the domi- 
nant contribution to the barrier crossing probability 
comes from lineages that are founded at a Hamming 
distance w — 1 from the portal. Note that if we 
set Pjj = 1 for simplicity, each generation approxi- 
mately 



such lineages are founded. 



(29) 



We will now estimate the probability that a lin- 
eage, starting at Hamming distance w — 1 from the 
portal, discovers the portal exactly t generations af- 
ter its founding. We approximate the valley genealo- 
gies by assuming that each valley individual can only 
have zero or one offspring each generation. This im- 
plies that a valley lineage consists of a single line of 
individuals; i.e., lineages do not branch. The proba- 
bility that such a lineage persists for at least t time 
steps is l/(/) . At t = 0, the lineage has w—1 bits set 
incorrectly, and L — w + 1 bits set correctly. In order 
for the lineage to discover the portal exactly at time 
t, it will have to mutate its bits such that, at time t 
and for the first time, the w — 1 "incorrect" bits will 
all have been flipped to the correct state and all the 
L — w + 1 correct bits are left undisturbed. Thus, 
between time and t, the w — 1 incorrect bits have 
to be mutated exactly once, while the correct bits 
have been undisturbed. Since we are calculating the 
probability for the portal to be discovered exactly at 
time t, one of the w — 1 bits has to flip at time t, 
while the other w — 2 might flip at any prior time. 
This gives (w — 1) t w ~ 2 possibilities for contributing 
flips. All other bits have to remain unflipped for all 
time steps. 

Thus, the probability P t find that a lineage finds the 
portal exactly at time t is approximately given by: 



P, 



find 



if 



w-2 



x ii w -\l- Li) Lt - w+1 ( j- 



1 

(7) 



(30) 



where the last factor gives the probability that the 
lineage survives until time t. Using Eq. (|^) and 
summing Eq. (BO) over t we find: 



(w-1) 



(w-1) 




00 , 



Li 2 _, 



(31) 



where Li„(iE) is the poly- logarithm function: essen- 
tially defined by the sum in the second line above. 
It is more insightful to approximate the sum with an 
integral. We then obtain 
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p 



find _ / A* 



(io-l)i ,0 - 2 e- Iog W*dt 



(32) 



log(a)(l- M ) 



Recall that the rate at which lineages at Hamming 
distance w — 1 are being created is wfxM/ (1—fi). Us- 
ing this and noting that the barrier crossing time is 
inversely proportional to p find ; we obtain a scaling 
of the form 



(t) oc 



1 



wlMfj, 



log(cr) 



(33) 



where we have neglected the factor (1 — [i) w « 1. 
The scaling relation of Eq. (j33|) recovers most of the 



empirically determined scaling behavior in Eq. (28) 



The dominant scaling with fi and logger) can be 
understood as follows. The average time that a 
lineage spends in the valley before going extinct is 
roughly l/log(er). Thus, /i/log(cr) gives the average 
number of mutations that a lineage in the valley un- 
dergoes before it goes extinct. Since this number is 
generally much smaller than 1, it can be interpreted 
as the probability of having a single mutation in a 
valley lineage. The probability of having w — 1 mu- 
tations is then of course (n/ log(a)) w ~ 1 . There is 
an additional factor from the rate at which val- 
ley lineages are being created at Hamming distance 
w — 1. Ref. |3l| also argues, along somewhat differ- 
ent lines, that the barrier crossing time should have a 
power-law dependence on mutation rate: (t) oc [i~ w . 

The correction factors — those with scaling expo- 
nents 7 and S in Eq. (|2^) — probably arise from the 
fact that lineages are not simple unbranching lines 
of descendants, as we have assumed, but are more 
complicated tree-like genealogies. 

The factor w\ in Eq. (^) counts the number of 
distinct paths of minimal length between the peak 
and the portal. Curiously, it appears from the scal- 
ing formulas that when w gets very large, the bar- 
rier crossing time starts to decrease again. Apply- 
ing Stirling's approximation to the factorial func- 
tion in Eq. (j33|) indicates that (t) has a maximum 
around wfi = log(cr). Although this may initially 
seem strange, it does make sense, since as we will 
now argue, fitness barriers for which Wji > log(cr) 
do not exist. 

If there are w\ independent paths between peak 
and portal, this implies that there are w indepen- 



dent directions from the peak into the valley. In 
other words, at least w bits of the peak genotype 
can undergo deleterious mutations. As we will see 
in Sec. IV below, the error threshold at which peak 
individuals becomes unstable in the population oc- 
curs near 



a(l-»Y 



1 



(34) 



To first order in (i, this is equivalent to log(cr) = L/i. 
That is, if L/i > log(cr), peak individuals will be lost 
from the population, and the population will start 
diffusing randomly through genotype space. Obvi- 
ously, the genotype length has to be longer than the 
barrier width L > w. Therefore, wfi > log(er) im- 
plies that the genotype length L is so large that it is 
impossible to stabilize the peak individuals. In other 
words, fitness barriers with wfi > log(er) simply do 
not exist. 

Finally, it should be noted that as log(er) — > oo, 
the barrier crossing time goes to a finite asymptote 
and not to infinity. Since valley lineages have proba- 
bility zero to reproduce in this limit, the asymptotic 
barrier crossing time is given by the (finite) waiting 
time for a "long jump" in which a peak individual 
undergoes w mutations at once. 

The main consequence of the scaling relations just 
derived, is that if the population is located on a fit- 
ness "plateau" in genotype space, surrounded by dif- 
ferent valleys on all sides, then it will most likely es- 
cape from the plateau via the valley with the small- 
est width and not via the valley path with the small- 
est depth. One concludes that high barriers can be 
passed relatively easily, as long as they are narrow; 
while wide barriers take a very long time to cross, 
even if they are shallow. 

We should emphasize that this situation is very 
different from the scaling of barrier crossing times 
generally encountered in physics or, for that mat- 
ter, in evolutionary models that literally interpret 
the "landscape" metaphor as leading to stochastic 
gradient dynamics on a fitness "potential" . In these 
settings, the system's state space has an energy func- 
tion defined on it that acts as a potential field. In the 
absence of any noise, the system is assumed to follow 
the gradient (downward) of the energy "landscape" . 
In the presence of noise, the system can deviate from 
its gradient path, but movement against the gradi- 
ent is unlikely in proportion to its deviation from 
the local gradient. The barrier crossing times then 
depend mainly on the barrier height, and they scale 
exponentially with this barrier height |L2] . 
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For example, imagine an energy barrier that con- 
sists of a steep slope upwards, followed by a long 
plateau and then a steep slope leading downward on 
the other side. The initial steep ascent from the val- 
ley onto the plateau is very unlikely since it involves 
moving against a steep gradient. However, after this 
unlikely step has been established, the system can 
cross the long plateau to the other side relatively 
easily, since it does not involve moving against an 
energy gradient. Thus, the width of the energy bar- 
rier is almost immaterial, while the barrier height is 
the defining impediment, since it determines the ex- 
tent to which movement against the gradient must 
occur. 

The situation is entirely different for fitness "land- 
scapes" in which an evolving population moves. For 
an evolving population, making a large jump in fit- 
ness is not unlikely at all. One mutation in one indi- 
vidual can do the trick. However, since some individ- 
uals remain at the peak, the individuals in the valley 
are continuously in competition with these higher- 
fitness peak individuals. An absolute fitness scale is 
set by these peak individuals. It is therefore survival 
at low fitness — compared to the most-fit individuals 
in the population — for an extended period of time 
that is unlikely. And this is why the time it takes 
to move across the plateau is the key parameter — 
which, of course, is controlled by the barrier width. 

The preceding discussion should make it clear, 
once again, that this analogy — between a population 
evolving over a fitness "landscape" and a physical 
system moving over its energy "landscape" in state 
space — is problematic: at best it may lead one to the 
wrong intuitions; at worst the basic physical results 
simply do not describe evolutionary behavior. 



IV. THE ENTROPY BARRIER REGIME 

Figures ||(a) and ||(b) showed that below a critical 
barrier height a c , where the dashed lines began to 
run horizontally, the barrier crossing time became 
effectively independent of a. We also saw that the 
theory breaks down for a < a c . In this regime, all 
peak individuals are quickly lost and the population 
diffuses through the valley until the portal is discov- 
ered. The theoretical calculations, in contrast, as- 
sumed the population was located at the peak and 
that short lineages were continuously spawned in the 
valley. It is no surprise then that the predictions 
break down in this regime. Since the population 



dynamics is dominated by diffusing through the val- 
ley's fitness-neutral volume, we refer to this as the 
entropy-barrier regime. Before discussing the barrier 
crossing times in this entropic regime, we first esti- 
mate the error threshold's location <r c as a function 
of /i, M, and L. 



A. Error Thresholds 

As a first, population-size independent approxi- 
mation one might guess that the error threshold oc- 
curs when the average number of peak-offspring pro- 
duced by a single peak individual in a population of 
valley individuals is 1. From Eq. (|^) this leads to 
an estimated critical barrier height a c of 

a e =(l-fi)- L . (35) 

This equation is the standard error-threshold result 
in molecular quasispecies theory HQ]. For the pa- 
rameters L = 10 and fi = 0.005 of Fig. |||(a), 
this leads to log(u c ) « 0.05, or a c ~ 1.05. As 
seen from the figure, though, the entropic regime 
extends to somewhat higher peak fitness; as far 
as log(er) « 0.06. This deviation is due to finite- 
population sampling effects and to neglecting back 
mutations, which become important as the error 
threshold is approached. 

For finite populations, the error threshold can be 
defined most naturally as those parameter values for 
which the mean proportion Pn of peak individuals 
equals the variance, due to finite population fluctu- 
ations, in Pr. That is, the criterion for reaching the 
error threshold is 

(Pn) 2 = Var(Pn) • (36) 

The intuition behind this definition is as follows: 
Since the proportion of peak individuals fluctuates, 
eventually a large fluctuation will occur that leads to 
the loss of all peak individuals. As was shown in Ref. 
[^o| , however, the waiting time for such a destabi- 
lization to occur increases exponentially with the ra- 
tio (P n ) 2 /Var(P n ). Only when {Pn) 2 < Var(P n ) do 
such destabilizations occur relatively frequently. For 
(Pn) 2 > Var(Pn) the fluctuations are small enough 
so that the proportion of peak individuals typically 
does not vanish. Therefore, it is natural to use Eq. 
( p6| ) to delineate the regimes with "unstable" and 
"stable" peak populations and so to distinguish be- 
tween the fitness-barrier and entropy-barrier regimes 
in the population dynamics. 
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Finite-population error thresholds may also be de- 
fined in alternative ways; cf. Ref. (24). Typically, 
one finds that, although the conceptual motivations 
differ, the quantitative parameter values for which 
the different error thresholds occur are quite simi- 
lar. 

The variance Var(Pn) can be most easily cal- 
culated using diffusion-equation methods. For an 
introduction to these techniques in the context of 
mathematical population genetics, see for instance 
Rcf. p8[ . To begin, we assume that, due to sam- 
pling fluctuations, at some particular time t the ac- 
tual proportion of peak individuals is not Pn but 
instead is P(t) — Pn + x(t). That is, the proportion 
P(t) of individuals on the peak deviates x(t) from 
its equilibrium value Pn- We focus on the dynamics 
of the deviation x(t). At the next generation, the 
expected deviation (x(t + 1)} is 



(x(t + l)) 



*(*) 
(/) 



(37) 



Thus, the expected change (Sx) in the deviation is 
given by 



= If) X = ~ 1X 



(38) 



where we have defined 7 by the last equality. 7 
measures the average rate at which fluctuations 
around the quasispecies equilibrium distribution are 
damped. The second moment ((Sx) 2 ) of the change 
Sx is approximately given by the variance of the 
binomial-sampling distribution. One finds that 



((Sx) 2 ) 



1 

M 



Pn 



(/)) 



1-Pn- 



(39) 



A Fokkcr-Planck diffusion equation approxima- 
tion determines the temporal evolution of distribu- 
tion Pr(x, i) of x(t) via 



-Pr(x,t)= - —(Sx)Pr(x,t) 



1 d 2 



((5x) 2 )Pr(x,t) 



(40) 



where (Sx) from Eq. (^) gives the drift term and 
((Sx) 2 ) from Eq. ( p9| ) the diffusion term. Solving 
for the limit distribution Pr(x) for x yields 



Pr(x) = C ( Pn + yjy 



2Af</>((/)-l)Pn 



2M</>«/>-l)(l-Ffr) 



x 1 - Pn - 



(41) 



Here C is a normalization constant that ensures 
Pr(x) is normalized on the interval x G [— Pn, 1 — 
Pn]- If we expand the fluctuations to second-order 
around x — 0, the distribution becomes a Gaussian 
given by 



Pr(z) = Ce iic-^n) 1 , 



(42) 



where C is again a normalization constant. From 
this distribution of fluctuations one directly reads 
off the variance Var(Pn), finding that 



Var(Pn) 



Pn(l-Pn) 

2M 7 
(/) (cr-(f)) 
2M((j - l) 2 



(43) 



and pq) to arrive at the last 



where we used Eqs. 
line. 



As noted before, we define the finite-population 
error threshold by those parameter values for which 
(P n ) 2 = Var(Pn). Using Eq. (|||) leads to the error- 
threshold parameter constraints given by 



2M((f) - l) 2 
</>(*-</» 



1. 



(44) 



If we substitute the parameter values = 0.005, 
M = 250, and L = 10 of Fig. |(a) and use 
Eq. (Q) for (/), we find the error threshold at 
log(er c ) w 0.059. This agrees quite well with the lo- 
cation at which the experimental curves start bend- 
ing upwards with increasing peak height. 



B. The "Landscape" Regime 



As we have pointed out previously, the scaling re- 
lations derived in Sec. Ill F contrast strongly with 



those based on "landscape" models in which the pop- 
ulation as a whole diffuses through the fitness land- 
scape 20 2^]. For those models, the barrier cross- 
ing time scales exponentially with population size 
and barrier height. It turns out that this scaling 
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behavior — appropriate to the "landscape" regime — 
can be reconciled with the scaling formulas derived 
in Sec. Ill F by closer inspection of Eq. (J44). 



As noted above, the average destabilization time 
for a fluctuation to occur that makes all peak indi- 
viduals disappear from the population scales expo- 
nentially in the ratio Var(Pn)/(-Pn) 2 given by Eq. 

Thus, Eq. (UJ) shows that this destabilization 
time increases exponentially with population size. 
For cases where (/} » 1 and for reasonable popu- 
lation sizes, the destabilization time is so large that 
the barrier crossing time is determined by how long 
it takes a rare mutant to cross the fitness valley. 

Close to the finite- population error-threshold 
((/) ~ 1), however, it might be the case that the 
time to create such a rare sequence of mutants is long 
in comparison to the destabilization time. In this sit- 
uation, the barrier crossing time is essentially given 
by the destabilization time: As soon as all peak indi- 
viduals are lost, the population diffuses through the 
valley and quickly discovers the portal. Thus, in the 
very restricted "landscape" parameter regime just 
around the error-threshold, the barrier crossing time 
is determined by the destabilization time and does 
scale exponentially with population size and barrier 
height. 

Beyond the error threshold — that is, for smaller 
populations, larger mutation rates, smaller barrier 
heights, or longer genotypes — the peak readily be- 
comes unoccupied. In this regime, the barrier cross- 
ing time becomes almost independent of barrier 
height er. The barrier to be crossed is then no longer 
a fitness barrier. Instead, it has become an entropy 
barrier. The population must search through almost 
all of the valley until the portal is discovered. Thus, 
only for parameters near the boundary between the 
fitness and entropic regime does the barrier cross- 
ing time scale in accordance with the "landscape" 
models. 



C. Time Scales in the Entropic Regime 



The population dynamics in the entropic regime 
beyond the error threshold is modeled most directly 
by considering an entirely flat (constant) fitness 
function; in particular, one in which all genotypes 
have fitness 1 and containing a single portal Q. The 
population starts out concentrated on a genotype 
at Hamming distance w from fl and evolves under 



selection and mutation until the portal genotype is 
discovered for the first time. Denote this average 
entropy-barrier crossing time by r. 

The calculation of the entropy-barrier crossing 
time appears less analytically tractable than the cal- 
culation of the fitness-barrier crossing time. The 
main difficulty arises from the sampling of individu- 
als at each generation, combined with the global con- 
straint of a fixed population size. Due to this sam- 
pling dynamics, subtle genetic correlations emerge 
between the individuals. Although some of the as- 
pects of the correlation statistics have been derived 
analytically the entropy-barrier crossing time r 
depends in a complicated, and not yet well under- 
stood, way on these correlations. We will discuss the 
difficulties with calculating entropic barrier crossing 
time by deriving several simple approximations and 
discussing why they fail to provide accurate quanti- 
tative predictions. 

First, one can approximate the neutral evolution 
just defined by assuming that each individual in the 
population has exactly one offspring. In this case, 
the population effectively consists of M independent 
random walkers that diffuse through genotype space. 
Since each individual has only one offspring one can 
identify its genealogy with a single evolving genotype 
that mutates each bit with probability /i at each gen- 
eration. Since [iL <C 1 in general, this genotype ef- 
fectively performs a random walk in the hypercube, 
where random walk steps are made at a rate of one 
step per l/(L/i) generations on average. 

The average time t\ a single random walker takes 
to discover Q is given by: 



n = £(i-M) 



1 

IW 



(45) 



i=l 



where the matrix indices run from Hamming dis- 
tance 1 through L. t\ determines an upper bound 
for the entire population's barrier crossing time. For 
parameter settings in the fixation regime, where 
ML/j, <C 1, sampling fluctuations cause the popula- 
tion to converge onto M copies of a single genotype. 
As is well known from the theory of neutral evolution 
[|l9| , this set of identical genotypes performs a ran- 
dom walk through the genotype space at the same 
rate as a single individual. Thus, in this limit, r% 
gives a reasonable prediction for the entropy-barrier 
crossing time. However, for Fig. ||(a)'s parameter 
settings (M = 250, \i = 0.005, and L = 10) that give 
MI/i = 12.5, we find that t\ « 23000, almost inde- 
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pendent of valley width w. Of course, this grossly 
overestimates the observed barrier crossing times, 
which vary from (t) * 25 for w = 2 to (t) ~ 227 
for w = 4. 

For M independent random walkers, one might 
simply assume that the waiting time would be 
roughly a factor M slower, i.e. tm = t\/M. Unfor- 
tunately, this leads to tm ~ 93 which overestimates 
the observed time for w = 2 and underestimates (t) 
for w = 4. 

The precise probability p w (t), that none of M in- 
dependent random walkers starting at a Hamming 
distance w have found the portal by time t, is given 
by: 



M 



p w {t)= \YM 



(46) 



this, it is easy to see, at least qualitatively, that the 
entropy-barrier crossing time is longer than that pre- 
dicted for independent random walkers. The corre- 
lation, or clustering, of individuals in genotype space 
leads the population to explore the valley's neutral 
volume at a slower rate. Thus, the predictions ob- 
tained by assuming M random walkers, as given by 
Eqs. <M) and @), are lower bounds to the actual 
waiting times. 

It turns out that the upper (Eq. fl45|)) and lower 
(Eq. (p7[)) estimates do not tightly bound the ac- 
tual waiting times (t). They may differ by several 
orders of magnitude. Fortunately, the lower bound 
obtained from Eqs. fl46| ) and ([|7|) typically produces 
reasonable order-of-magnitude estimates for param- 
eter regimes in which ML/i > 1. This order-of- 
magnitude estimate gives the following scaling re- 
lation for the entropy-barrier crossing time 



From this, one estimates the average entropy-barrier 
crossing time r to be: 

oo 

TM = /]t\pw(t-l)- Pw{t)} 
t=l 
oo 

= (47) 



MLfi 



(49) 



D. Anomalous Scaling 



For Fig. ||(a)'s parameters, Eq. p7j ) gives r « 15, 
58, and 117 for barrier widths w = 2, 3, and 4, 
respectively. These values underestimate each ob- 
served waiting time by almost a factor of 2. Appar- 
ently, sampling fluctuations cause the population to 
explore the genotype space less rapidly than inde- 
pendent random walkers do. As already noted above, 
the reason for this is that sampling convergence leads 
different individuals to evolve genetic correlations to 
some degree. 

One way to think about this is to investigate ge- 
nealogies. Ref. H showed that the probability Pr(i) 
for two randomly chosen individuals in the current 
population to have had a common ancestor less than 
t generations ago is approximately given by 



Pr(i) w 1 - e- l ' M 



(48) 



This means that, on average, a pair of individuals 
has only undergone ML/i mutations each since the 
time t ~ M they descended from a common an- 
cestor. When Mfi is not much larger than 1, this 
implies that two individuals are more strongly cor- 
related genetically than random genotypes. Due to 



The order-of-magnitude estimate given by Eq. 
19|) predicts that the the entropy-barrier crossing 
time t scales inversely with both fi and M. This 
scaling is, of course, exactly one's intuitive expec- 
tation: the rate at which the genotype space is ex- 
plored is proportional to both mutation rate \x and 
population size M. M individuals cover M times as 
much genotypic "ground" as one individual. Individ- 
uals that "move" twice as fast, cover twice as much 
ground as well. And so, the waiting time should be 
inversely proportional to both M and /x, which set 
the exploration rate. 

In light of this, it is interesting that data from sim- 
ulations shows that the entropy-barrier crossing time 
r scales as a power law in both \x and M, but not 
with exponents equal to —1, as the preceding sim- 
ple argument suggests. To be clearer on this point, 
Fig. [| illustrates the observed scaling behavior of 
the entropy-barrier crossing time as a function of M 
and ijl. 

The solid lines plot the data obtained from simula- 
tions while the dashed lines show scaling (power-law) 
functions that were fitted to the experimental data. 
All axes use logarithmic scales. All simulations were 
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performed with genotypes of length L = 10 bits. In 
all of the runs, at time t = all individuals start at 
Hamming distance w = 5 from the portal. Figure 
||(a) shows r's dependence on M for three differ- 
ent values of /i. The approximately straight lines 
show that the entropy-barrier crossing time depends 
roughly as a power law on the population size M: 



T OC 



1 



(50) 



Of course, the scaling exponent a may itself depend 
on fi. 

Similarly, Fig. ||(b) shows the dependence of r on 
fi for two different values of M. In this case too, the 
curves appear well approximated by a straight line, 
indicating that for fixed M the dependence on [i is 
roughly given by 



I* 



(51) 



where (3 may again depend on the population size M. 
In Table | the exponents of the estimated dashed 
lines in Figs. [|(a) and [|(b) are given, along with 
their estimated errors. 





0.002 


0.005 


0.008 


a 


0.740 ± 0.01 


0.744 ± 0.02 


0.761 ±0.03 


M 


50 


250 




a 


1.292 ± 0.008 


1.365 ±0.014 





TABLE I. Estimated exponents a and (3 as defined by 
Eqs. @ and (Ell). 
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FIG. 4. Entropy-barrier crossing time r as a function of population size M and mutation rate fi. The solid lines 
are data obtained from simulations, while the dashed lines show the estimated scaling functions. All experiments 
were done with genotypes of length L = 10 and a barrier width of w = 5. All axes are shown on logarithmic scales. 
In Fig. (a) r is shown as a function of population size M for three values of the mutation rate u. In Fig. (b) r is 
shown as a function of the mutation rate fi for two values of population size M. The approximately straight lines 
show that t scales as a power law in M when /j, is kept constant and, vice versa, as a power law in fi when M is kept 
constant. Table | lists the estimated exponents for the power laws. These were used to plot the dashed lines. 



The values of the exponents a for different \i and (3 
for different M are very close to each other; a w 3/4 
and (3 ~ 4/3. It is clear, however, that they are not 
constants: a does depend on fi and f3 on M. Note 
that the estimates for a are all below 1, while those 
for (3 are above 1. Thus, doubling the population 
size decreases r by less than a factor of two, while 
doubling the mutation rate decreases r with more 



than a factor of two. Intuitively, what is happen- 
ing is that, due to the clustering in the population, 
doubling the population size does not lead to a dou- 
bling of the exploration rate. That is, some of the 
"added" members in the larger population will sim- 
ply occur at genotypes where other members of the 
population are already located. Thus, they do not 
contribute to additional novel exploration. In con- 
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trast, doubling the mutation rate not only doubles 
the rate of movement (diffusion) of individuals in 
the population, it additionally decreases the cluster- 
ing and so reduces genetic correlations. Due to the 
combination of these two effects, the entropy-barrier 
crossing time decreases more than a factor two. 

Of course, one would like to predict these anoma- 
lous exponents. In principle, they should be calcu- 
lable from knowledge of the clustering structure of 
the population at different values of M and /i. For 
example, if we view the population as a blob or col- 
lection of blobs in genotype space, one would like 
to know how many distinct genotypes, on average, 
are neighboring one or more individuals of the pop- 
ulation. Roughly speaking, we would like to know 
the average "surface area" of the population in geno- 
type space. Knowledge of this statistic would then 
supply us with the average probability that a mu- 
tation leads to a genotype not currently present in 
the population. This, in turn, quantifies the popula- 
tion's rate of exploring novel genotypes, while taking 
into account genetic correlations. Although several 
statistics of these genotype blobs were calculated in 
Ref. §, we have at present not been able to adapt 
these results to infer the necessary type of statistics 
just outlined. The analytical prediction of the scal- 
ing exponents a and (3 thus awaits further progress. 
For the present, we will use our order-of-magnitudc 
estimate in Eq. ( |49| ) to compare the entropy-barrier 
crossing time with the fitness-barrier crossing times. 

In summary, we analyzed the fitness- and entropy- 
barrier crossing times for the simplest (single peak) 
landscapes in which both types of barrier occur. Our 
results are summarized by the scaling relations of 
Eqs. (12§1 ) and (ff9l). In the following sections, we 
apply the preceding analysis to more complicated 
fitness functions that contain multiple fitness and 
entropy barriers. 



V. TRAVERSING COMPLEX FITNESS 
FUNCTIONS 



Up to this point, to make analytical progress we 
focused on fitness functions that were intentionally 
simple: a single portal and a single peak in genotype 
space. Despite this, the analysis of barrier crossing 
times just developed can be extended with relative 
ease to more complicated evolutionary processes. To 
illustrate this extension of the theory, we now intro- 
duce a class of more complicated fitness functions 



that contain multiple fitness and entropy barriers of 
tunable width and height. That is, in this class of 
fitness functions, the population may have to cross 
both a fitness and entropy barrier to escape from 
its metastable state. Since the relative sizes of both 
these types of barriers can be tuned, we can explic- 
itly compare the time scales for crossing fitness and 
entropy barriers within the same evolutionary pro- 
cess. 



A. The Royal Staircase with Ditches 

The class of fitness functions which we call the 
Royal Staircase with Ditches is closely related to 
the Royal Staircase and Royal Road fitness functions 
that we have analyzed previously pU27|-|30|] . Those 
fitness functions did not contain fitness barriers but 
instead consisted of a series of entropy barriers. The 
function class of Royal Staircases with Ditches gen- 
eralizes these fitness functions and is defined as fol- 
lows: 

• Genotypes consist of bit sequences of length 
L, interpreted as N blocks of K bits each: 
L = NK. 

• The blocks are ordered, not in the sense that 
they correspond to particular positions in the 
genotype, but only in the sense that they are 
indexed 1 through N . Note that since our 
evolutionary process does not include recom- 
bination, the population dynamics is invariant 
under arbitrary permutations of a genotype's 
bits. For convenience, we order the blocks from 
left to right in the genotypes. That is, bits 1 
through K belong to the first block, bits K + 1 
through 2K belong to the second block, and so 
on. 

• The 2 K possible configurations of the K-bit 
blocks each are divided into three classes. 

1. Type- A blocks consist of a configuration 
with K ones: 

A = 111- ■ ■111 . (52) 

K 

2. Type-B blocks consist of a configuration 
with K — w ones and w zeros: 

w 

B = 111 ■ ■ -111 000^000 . (53) 

K-w 
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As will become clear, the parameter w 
controls the width of the barriers. 

3. All other 2 K — 2 configurations are de- 
noted as Type-* blocks. 

• A genotype s with blocks 1 through n — 2 of 
type B and block n— 1 of type A receives fitness 
/(s) = n. These genotypes have the structure: 



JV-ra+l 




Note that the configurations of blocks n 
through N are immaterial (denoted *) when 
the first Ti — 1 blocks occur in the above geno- 
type configuration. 

• Genotypes s with blocks 1 through n — 2 of 
type B , block n — 1 of type *, and block n 
of type A receive fitness /(s) = n — h. These 
genotypes have the structure: 

N—n 

BB ■ ■ ■ BB y *A7T^~T* , (55) 

Again, the configurations of blocks n + 1 
through N are immaterial. The parameter h 
controls the height of the fitness barrier. 

The Royal Staircase fitness functions that we stud- 
ied in Refs. Q and [^7j are a special case (w = 0) 
of the Royal Staircase with Ditches class of fitness 
functions. For the special case w = there are no 
fitness barriers and a genotype has fitness n when 
the first 7i — l blocks are A = B (all Is) types and 
the nth block is set to any of the 2^ — 1 other con- 
figurations. 

Setting w = 1 produces a somewhat degenerate 
case that we will not consider. 

For values of w > 2, there is a genuine fitness bar- 
rier of width w bits. For instance, consider the case 
where, at some point in time, the highest fitness in 
the population is / = 4. This corresponds to geno- 
types that have first and second blocks of type B 
and a third block of type A. In this case, a portal 
genotype Q n , corresponding to a fitness of 5, is ob- 
tained when the fourth block is set to type A and the 
third block is changed from type A to type B. Geno- 
types with fitness 4 can mutate their fourth block, 
until it becomes type A, without changing their fit- 
ness. That is, the fourth block may be changed 



into type A along a neutral path and setting the 4th 
block correctly corresponds to crossing an entropy 
barrier. However, after that, the third block needs 
to be changed from type A to type B. All interme- 
diate type * blocks give genotypes a reduced fitness 
of / = 4 — h. We call these ditch genotypes, since 
they are located in a lower-fitness region in genotype 
space that separates genotypes with fitness 4 from 
genotypes with fitness 5. 



B. Evolutionary Dynamics 



We evolve populations on the Royal Staircase with 
Ditches under a simple selection and mutation dy- 
namics similar to the one outlined in Sec. ||. This 
consists of the following steps. 

1. At time t = a population of M ran- 
dom binary-allele genotypes (bit sequences) of 
length L is created. These M individuals con- 
stitute the initial population. 

2. The fitness of all M individuals is determined, 
using the function defined in the previous sec- 
tion. 

3. M individuals are sampled from the popu- 
lation, with replacement, and with probabil- 
ity proportional to their fitness. That is, the 
population undergoes fitness-proportional se- 
lection in discrete generations. 

4. Each bit in each of the M selected individu- 
als is mutated with a probability u. The M 
individuals thus obtained form the new gener- 
ation. 

5. The procedure is repeated from Step 2. 

We evolve the population according to the above 
dynamics until genotypes of optimal fitness have 
been discovered and the population appears to have 
reached a stable average fitness. During each run, we 
estimate a number of statistics — such as, the aver- 
age time until individuals of a certain fitness appear 
for the first time. 
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C. Observed Population Dynamics 

The population dynamics under Royal Staircase 
with Ditches functions is qualitatively very similar 
to that under the Royal Road and Royal Staircase 
fitness functions. Samples of this typical behavior 
are shown in Figs. ||(a)-(d). The plots there show 



the average (/} (lower, solid lines) and best (up- 
per, dashed lines) fitnesses in the population over 
time for four single runs with four different param- 
eter settings. The parameter settings for each run 
are indicated above each figure, except for the bar- 
rier widths w and barrier heights h that were used. 
All runs used barriers of widths w = 2 and heights 
h = l. 
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FIG. 5. Four runs of the Royal Road with Ditches population dynamics with four different parameter settings. 
The upper dashed lines plot best fitness and the lower solid lines, average fitness (/) in the population as a function 
of time, measured in generations. The parameter settings for each run are indicated above each figure. All fitness 
functions have barriers with a width of w — 2 and a height of h = 1. The values of the average fitnesses were obtained 
by taking a running average over 10 generations. This reduces the relatively large fluctuations in average fitness 
between successive generations. 



The qualitative dynamics follows the typical alter- 
nation of long periods of stasis (epochs) in the aver- 
age population fitness and short bursts of innovation 
to higher average fitness; a class of evolutionary dy- 



namics that we call epochal evolution ]29| ]. At the 
beginning of a run the average and best fitness are 
low. This simply reflects the fact that high-fitness 
genotypes are very rare in genotype space and thcre- 
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fore do not occur in an initial random population. 
A balance is quickly established between selection 
and mutation that leads to a roughly constant aver- 
age fitness in the population. This period of stasis 
we call an epoch. After some time, a mutant may 
cross the fitness barrier and discover a portal, i.e., a 
higher-fitness genotype. Relatively frequently, this 
high-fitness mutant is lost through sampling fluc- 
tuations or deleterious mutations. Such events are 
seen as isolated spikes in the best fitness in Fig. |f| 
Eventually, one of these beneficial mutants spreads 
through the population — an innovation occurs. At 
this point the average fitness increases, until a new 
equilibrium between selection and mutation is estab- 
lished. 

Although many properties of epochal evolution 
can be treated analytically |$0|, here we will focus 
solely on the epoch times. These are the average 
times between the start of a given epoch and the 
start of the next. Average epoch times can be ob- 
tained from simulation data by tracing backward in 
time the behavior of the population's best fitness. 
The population dynamics runs until genotypes of the 
highest possible fitness N+l have established them- 
selves in the population for an extended period of 
time. (For certain parameter settings — specifically, 
those beyond the error threshold — genotypes with 
the highest fitness may not stabilize in the popula- 
tion at all. For such parameter settings, an epoch 
time can be effectively infinite. We will not consider 
such parameter settings explicitly here.) From there 
we trace backwards the last time t n that fitness n 
was the highest in the population, for all values of 
n. The differences t n — give the epoch times 
r„. Some fitness levels may not occur during a run. 
For instance, in Fig. ||(a) fitness / = 1 never oc- 
curs, since the population starts out with genotypes 
of fitness 2. The epoch time t\ is therefore for 
this particular run. Average epoch times are then 
estimated by averaging the epoch time T n over an 
ensemble of runs. In the following we calculate ana- 
lytical approximations to these epoch times t„ , using 
the results from the preceding development. 



D. Epoch Quasispecies and The Statistical 
Dynamics Approach 



In order to approximate epoch times analytically, 
we need to determine the average proportion of 
the population at highest fitness duri ng each epoch. 
This is the equivalent of Pn in Sec. III. From this 



we can estimate the rate of creation of genealogies in 
the ditch between the neutral networks of two succes- 
sive epochs. Then we need to calculate the average 
population fitness during each epoch to determine 
the average life time of these ditch genealogies. For 



the fitness functions studied in the Sec. |I| these 
quantities were relatively straightforward to calcu- 
late. There, individuals were either on the peak or 
in the valley, at a certain distance from the portal. 
For the Royal Staircase with Ditches fitness func- 
tions the situation is more complicated. 

In principle, one can calculate the current equiv- 
alent of Pn by representing the population as a dis- 
tribution of genotypes and calculating metastable 
genotype distributions for each epoch. This is typi- 
cally done in population genetics models and in 
the standard quasispecies models of molecular evo- 
lution |Q. However, since genotype spaces are typ- 
ically very large, an analytical treatment that ex- 
plicitly takes into account finite-population effects is 
generally infeasible within this genotypic representa- 
tion. To address this problem, we introduced an al- 
ternative approach that we call statistical dynamics 
[p|, p9pc| l. There one chooses a relatively small num- 
ber of macroscopic variables with which to describe 
the population at any given time. Other degrees 
of freedom are then averaged out using a maximum 
entropy method similar to the Gibbs method from 
statistical mechanics. We will use this approach be- 
low and simply refer the reader to Refs. Q and J3(J 
for more extensive treatment of statistical dynam- 
ics and a discussion of the relation of this approach 
to standard quasispecies theory, mathematical pop- 
ulation genetics, and other theories from the field of 
evolutionary computation |25[] . 

We represent the population state at any point 
in time by a fitness distribution. That is, instead 
of describing the population by the relative frequen- 
cies of all genotypes in the population, we describe 
it by the relative frequencies of different fitness val- 
ues. Of course, a given fitness distribution does not 
uniquely specify a population of genotypes. In or- 
der to construct the dynamics on the level of fitness 
distributions, we must "average out" the additional 
genotypic degrees of freedom somehow. We do this 
by assuming, at each generation, a maximum en- 
tropy distribution of genotypes given the distribu- 
tion of fitness. In the Royal Staircase with Ditches 
fitness functions, this translates into assuming that 
each string with a given fitness / is equally likely 
to be any of the genotypes with fitness /. That is, 
a genotype with fitness n will have its first n — 1 
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blocks each in a specific state. Blocks n through 
N are assumed to occur in any of their 2 K possible 
states with equal probability. Similarly, for a "ditch" 
genotype with fitness n — h, there are n — 1 blocks in 
fixed genotypic states and one block of type *. We 
assume this *-block occurs in any of its 2 K — 2 pos- 
sible bit configurations with equal probability. Sim- 
ilarly, we assume that blocks n + 1 through N are 
equally likely to be in any of their 2 K configurations. 
That is, we assign equal probability to all genotype 
distributions that are consistent with the given fit- 
ness distribution. Since we know the dynamics on 
the genotype level, we can construct the expected 
dynamics on the level of fitness distributions under 
these assumptions. 

Formally, we represent the population at time 
f as a vector P(t) with components P n (t),n = 
1,2,..., N+l that denote the proportion of fitness-n 
individuals in the population and with components 
Pn,*{t) that denote the proportion of fitness n — h 
individuals in the ditch between fitness n and fitness 
n+l genotypes. (Note that in the cases where h is an 
integer the distribution P(t) is not simply a fitness 
distribution, since it distinguishes ditch individuals 
from nonditch individuals at the same fitness.) 

We then construct a generation operator G, simi- 
lar to the one in section III A , that acts on a fitness 



distribution P(t) and returns the expected fitness dis- 
tribution (P(t + 1)) at the next generation. That is, 
the dynamics at the level of fitness is to be given by 



(P(t+l)) = 



G - P(t) 
(/) 



(56) 



where (/) is the average fitness in the population. 

During epoch n, in which the best fitness occur- 
ring in the population is n, there will be a roughly 
constant fitness distribution P™. These vectors 
P n are the solutions of the fixed point equations 
(P{t+1)) = P(t) determined by Eq. (||). Once the 
operator G is constructed, the epoch distribution 
can be calculated quite easily. More specifically, in 
Refs. [^9| and we showed that the metastable 
fitness distribution that occurs during epoch n is 
determined by projecting the operator G onto all 
dimensions with fitness smaller than or equal to n 
and calculating the principal eigenvector of this pro- 
jected operator. Determining the epoch-n quasis- 
pecies reduces, in this way, to finding the principal 
eigenvectors of the matrix G restricted to compo- 
nents with fitness lower than or equal to n. In App. 



[A| the generation operator G for the Royal Staircase 
with Ditches is constructed explicitly, and analytical 
approximations to the epoch quasispecies distribu- 
tions P n are calculated as well. Since the expressions 
we find for the fitness distributions P n are rather 
cumbersome and we will not give them here. 



E. Crossing the Fitness Barrier 

With the analytical expressions for the epoch fit- 
ness distributions P n in hand, we can calculate the 
expected epoch times r„. An epoch ends via an 
innovation — a process that proceeds in two stages. 
First, a portal genotype of fitness n + 1 is created. 
And second, this beneficial mutant spreads through 
the population, rather than being lost. 

In order to calculate the average time until a por- 
tal genotype is discovered we calculate the probabil- 
ity p seed that a single selection plus mutation from 
the current population seeds a new lineage that dis- 
covers the portal. That is, either by creating a new 
lineage in the ditch that discovers the portal or else 
by producing a jump mutation that becomes a por- 
tal genotype at once. 

This calculation is very similar to that in Sec. 
III. First of all, the portal is unlikely to be dis- 
covered by anything other than either a jump mu- 
tation from a fitness-n individual or by a mutation 
from a ditch genotype. Moreover, ditch lineages are 
very unlikely to be seeded by anything other than 
mutants of fitness-n genotypes. Therefore, we can 



write P 



seed 



pseed = nP:(l-^-V K 



fn2 K 



K 



Y^ti (M™ - (1 - flfSr 



(57) 



i=0 



The first factor nP"//„ gives the probability that 
a fitness n individual is selected. For the offspring 
of this individual to end up in the ditch, its nth 
block should be type A (thereby contributing the 
factor 2~ K ) and its first n — 2 blocks should be left 
undisturbed by mutation (corresponding to the fac- 
tor (1 — /i)("~ 2 ' x ). The terms within the sum give 
the probability that a valley lineage is seeded by mu- 
tation of the (n — l)st block at Hamming distance i 
from the portal. Finally, the factors ti give the prob- 
abilities that a lineage, starting at Hamming dis- 
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tance i from the portal, discovers the p ortal. They 
are analogous to the of Sec. Ill B| . Note that 
the term eoM „, = 1- fx w (l — fx) K ~ w corresponds to 
a jump mutation from a fitness-n genotype directly 
to a portal configuration. Again, changes to blocks 
n + 1 through N are immaterial. 

The €j are calculated paralleling the development 
in Sec. Ill B . The only differences are that the geno- 
type length L is now the block length K here, since 
we are only interested in mutations in the (n — l)st 
block, and that the average number of offspring is 
slightly modified. The average number of ditch off- 
spring r that an individual in the ditch produces on 
average is given by 



(n - d){l - n)^- 1 ^ 

fn 



(58) 



7r n = 1 - exp 



, (n +1)(1- 



(61) 



A mutant must be discovered 1 /n n times on average 
before it stabilizes and spreads through the popula- 
tion. Thus, g n /ir n gives the average number of gen- 
erations until a portal genotype is discovered that 
spreads through the population — an innovation. 

Finally, we have to account for the possibility that 
epoch n does not occur at all during a run. Only a 
fraction P e (n) of the runs contain epoch n, since, if a 
higher-fitness genotype occurs in the initial random 
population, epoch n will be skipped. The proportion 
P e (n) is simply the probability that no genotypes of 
fitness greater than n occur in the initial population. 
This is given by 



The expressions for the e, in this case can be ob- 
tained by rep lacing the factor 1/ (/) in the equations 
in Sec. G II B| by r. 



Eq. t p7\ ) can be further simplified and we eventu- 
ally obtain the result that 



pseed 



pn 
_n 

2 K 



(59) 



(1- fj,) K (n-d) 



log(l 



P e (n) 



E 

,i=0 



1 

2 Ki 



1 



M 



(62) 



Putting Eqs. (|60|), (|6l|), and (|62j) together, the the- 
oretical predictions for the epoch times r n become 



Pe(n) 



9n 



(63) 



where e w is given by Eqs. (151) and ( |16| ) and P r " is 
determined as outlined in App. |a| . 

Thus , once we have calculated e w using the results 
of Sec. Ill B and calculated P™ using the derivation 
described in App. |A|, we find that the average num- 
ber g n of epoch-n generations until a genotype of 
fitness n + 1 is created is given by 



1 



g n = 



l - (l - P seed ) 



M ' 



(60) 



In this, we ha ve neglected the correction term (dt) 
of Sec 



hid 



for the time between the seeding of 
the ditch lineage that leads to the discovery of the 
portal and the actual time at which the portal is 
discovered. 

We now must calculate the second stage of epoch 
termination; namely, the probability 7r ra that a newly 
discovered mutant of fitness n + 1 spreads through 
the population. In Ref . [ 3(1 we showed how a diffu- 



sion equation approach 17|lq ] can be used to esti- 
mate this probability. Applying this here, the prob- 
ability 7r„ that a mutant of fitness n + 1 will spread 
through the population is given by: 



F. Theoretical and Experimental Epoch Times 



We tested the predictions of Eq. ( |63| ) against ex- 
perimentally obtained average epoch times for the 
same parameter settings used in Fig. ^. The results 
are shown in Fig. |^. Epoch times r„ are shown as a 
function of epoch number n. The dashed lines give 
the theoretical predictions of Eq. ([33|) and the solid 
lines, the experimentally estimated averages. The 
data in Fig. ||(a) is an average over 150 evolution- 
ary runs. All other plots are averages over 250 runs. 
The fluctuations in the experimental lines indicate 
that there is a large variance of epoch times between 
runs and that, therefore, the raw data is rather noisy. 
Despite this, the figure demonstrates that the theory 
estimates epoch times quite accurately. 

First, these results show that the analysis pre- 
sented in Sec. Ill can be usefully applied to more 



complicated fitness functions that posses many fit- 
ness and entropy barriers. For simplicity in the 
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present case, we restricted ourselves to fitness func- 
tions with barriers of equal height and width. How- 
ever, since our statistical dynamics methods analyze 
epochs in a piecewise manner, the theoretical results 
easily extend to more general cases with barriers of 
variable width and height. The essential ingredi- 
ent of the analysis is still the genealogy statistics of 
valley lineages in the ditch that connects to portal 
genotypes. The only additional ingredients required 
by the analysis for more complicated cases are (i) the 
rate of creation of new lineages in the ditch and (ii) 



the distribution of Hamming distances to the portal 
at which these lineages are seeded. In the case of the 
Royal Staircase with Ditches, this was largely deter- 
mined by the proportion P™ of fitness n individuals 
during each epoch n. Apart from this determining 
factor, the actual crossing of fitness barriers is still 
gove rned by the scaling relations presented in Sec. 
Ill F. In particular, the qualitative remarks at the 



end of Sec. Ill F still hold in these more general set- 
tings. Epoch times grow very rapidly with barrier 
width and quite slowly with barrier height. 



(a) N=10, K=5, n=0.001, M=500 



(b)N=10, K=5, (1=0.003, M=500 
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(c) N=10, K=5, |i=0.01, M=500 
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FIG. 6. Experimentally estimated (solid lines) and theoretically predicted (dashed lines) epoch times r n for the 
four different parameter settings of Fig. |^ as a function of epoch number n. All fitness functions have ditches of 
width w — 2 and height h = 1. The experimental epoch times are an average over 150 runs for Fig. (a) and over 250 
runs for Figs, (b), (c), and (d). 
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Second, and perhaps of more immediate interest, 
the general shape of the curves in Fig. ^ reveals sev- 
eral novel population dynamical phenomena. For 
the lower mutation-rate runs (Figs, ^(a), ||(b), and 
^(d)), the epoch times show a distinct break between 
the relatively small t\ and the much larger . This 
jump simply reflects the fact that for these popula- 
tion sizes, it is very likely that individuals with fit- 
ness 2 occur in the initial random population. Due 
to this, epoch 1 is skipped most of the time. How- 
ever, individuals of fitness 3 are unlikely to occur 
in the initial population — the occurrence of a fitness 
3 individual is roughly one in 10 3 — so that the sec- 
ond epoch is not skipped. In Fig. ||(a) and ^(b), 
the epoch time curve reaches a maximum after this 
quick jump and then drops off slightly. In Fig. ||(a) 
with mutation rate /i = 0.001 the times seem to just 
reach a minimum around the last epoch, number 10. 
The curves in Fig. ^(b) reach a minimum epoch time 
somewhere around epoch 5 and then start increasing 
again. The behavior typically occurs for a range of 
parameter values with low mutation rates and with 
block sizes K that are not too small. For instance, 
the theoretical analysis predicts that if there were 
more than 10 blocks in the case of Fig. ^(a), then 
the curve would start to rise after epoch 10. 

Qualitatively this behavior can be understood as 
follows. Since the barriers all have equal height h, 
the barriers at later epochs are relatively more shal- 
low than those visited during early epochs. From 
the analysis in the last section recall that the aver- 
age number of offspring that ditch individuals pro- 
duce in the ditch is r = (n — h)/n. As n increases, 
this number approaches 1 from below. That is, ditch 
lineages survive longer for later epochs. This effect, 
of course, decreases epoch times, and this causes the 
initial decrease of epoch times in Figs. |6|(a) and^(b). 
However, the ditch lineages are seeded by mutants of 
fitness-n individuals. For later epochs, fitness n indi- 
viduals have many more bits, (n — 1)K, that need to 
be set to specific configurations. They are, therefore, 
more likely to undergo deleterious mutations. The 
proportion P T " of fitness-rt individuals during epoch 
n thus decreases as a function of n. This tends to 
increase the epoch times since smaller P™ implies a 
lower rate of seeding of ditch lineages. Moreover, the 
probability 7r„ that a genotype of fitness n + 1, once 
found, spreads through the population decreases as 
n increases as well. Eventually, these effects start 
to dominate, and epoch times start rising again. In 
fact, at a certain point, for large n, the probabil- 
ity 7r„ may become so small that fitness-n + 1 in- 
dividuals cannot be stabilized in the population at 



all. In this regime, the dynamics again reaches the 
well known error threshold: the population dynam- 
ics cannot store the necessary nK bits of information 
that define epoch n + 1. 

In Figs. ||(c) and ^(d), there is no initial decrease 
of epoch times: the epoch times increase monotoni- 
cally as a function of n. In this case, from the start 
of the runs the decrease of 7r„ and P™ with n domi- 
nates the effect of the relatively shallower ditches. 

N=10, q=0.003, M=500, K=5,9,1 1 ,12 
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FIG. 7. Comparison of the entropy- and fit- 
ness-barrier crossing times for the Royal Staircase with 
Ditches fitness function. The dashed line gives the epoch 
times for the case of minimal width (w = 2) ditches of 
height h — 1. This data is the same as that in Fig. [j(b). 
The solid lines plot the entropy-barrier epoch times for 
a fitness function without ditches (w = 0) for several 
different block lengths: K = 5, 9, 11, and 12. All other 
parameters are identical to the ditch-case parameters: 
i.e., N — 10 blocks, a mutation rate of fi — 0.003, and a 
population size of M = 500. The vertical axis is shown 
on logarithmic scale. 

Finally, we compare these epoch times, for ditches 
of height h = 1 and width w = 2, with the epoch 
times for the entropy-barrier case (w = 0) at differ- 
ent block lengths K. This comparison is shown in 
Fig. |?]. Epoch times r are shown on a logarithmic 
scale. The dashed line shows the experimentally ob- 
tained data for the ditch case with N = 10 blocks 
of length K = 5, a (minimal) ditch width of w = 2, 
and height of h = 1, a population size of M = 500, 
and a mutation rate of fi = 0.003; as used in Fig. 
^(b). The solid lines show experimentally estimated 
epoch times for the neutral case (w — 0), for several 
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different block lengths. All other parameters are the 
same. 

In comparing the solid line for K = 5 with the 
dashed line we see that the introduction of this min- 
imal ditch increases the epoch times by factors rang- 
ing from 50 to 250. In order to obtain roughly com- 
parable epoch times for the neutral case w = 0, one 
has to increase the block length to as high as = 12. 
This demonstrates how much more rapidly entropy 
barriers are crossed than fitness barriers. In the time 
that it takes the population to cross a fitness barrier 
of width w = 2, the neutrally diffusing population 
will have crossed an entropy barrier of an additional 
12 — 5 = 7 bits. Thus, the neutrally diffusing pop- 
ulation explores roughly 2 7 = 128 times as many 
different neutral configurations in the time that it 
takes to cross the 2-bit fitness barrier. As we will 
see below, the difference in time scale for crossing 
fitness and entropy barriers grows much larger for 
more realistic (lower) mutation rates and (longer) 
sequences. 



VI. CONCLUSIONS 



We analyzed in detail the barrier crossing dynam- 
ics of a population evolving under selection and mu- 
tation in a constant selective environment. Barriers 
of two distinct types exist: fitness barriers and en- 
tropy barriers. Fitness barriers occur in a fitness 
"landscape" when genotypes of higher fitness than 
currently present in the population are separated 
from the current most-fit genotypes by valleys of 
lower fitness. In order for such barriers to be crossed, 
a rare sequence of mutants must cross the valley of 
low-fitness genotypes. The second type of barrier, 
entropy barriers, occurs when genotypes of current 
best-fitness form large neutral networks in genotype 
space that only have a small number of connections 
(portals) to genotypes of higher fitness. Evolving 
populations diffuse at random through these neu- 
tral subbasins under selection and mutation. Since 
connections to higher fitness genotypes are rare, the 
population must search and spread over large parts 
of the neutral network, before higher-fitness geno- 
type portals are discovered. 

We will now qualitatively and quantitatively sum- 
marize the general picture that has emerged from 
our analysis of barrier crossing dynamics. The first 
important observation to be made is that there is 
a large qualitative difference between the genealo- 



gies of individuals of current best fitness and those 
with suboptimal fitness. All suboptimal individu- 
als in the population are relatively recent descen- 
dants of genotypes with the current highest fitness. 
In other words, individuals with suboptimal fitness 
only give rise to genealogical bushes — genealogies of 
finite, typically short, length. Individuals with the 
current highest fitness are the only ones that give 
rise to lineages of potentially infinite length. 

More formally, let us denote by A the neutral net- 
work of genotypes whose fitness equals that of the 
current highest-fitness individuals. The subpopula- 
tion of the current population that is on the neu- 
tral network A then effectively acts as a source for 
the whole population's descendants. Genealogies of 
lower-fitness individuals outside of A go extinct rel- 
atively quickly and are replaced by new genealogies 
that are mutant descendants of individuals on the 
neutral network A. Therefore, only lineages of indi- 
viduals on the network A can "travel" long distances 
through genotype space. The subpopulation of geno- 
types in A diffuses randomly through A, eventually 
visiting almost all genotypes in A and all of their 
single mutant neighbors. 

At the same time, the excess reproduction of indi- 
viduals in A, combined with deleterious mutations, 
creates individuals in the lower-fitness valleys or 
ditches around A. These individuals then give rise to 
genealogies of valley individuals that, in turn, probe 
at random the genotype space surrounding A. How- 
ever, since these valley genealogies typically go ex- 
tinct quite rapidly, individuals in the valley never 
travel far from the neutral network A. That is, only 
those parts of the valleys that are in the immedi- 
ate neighborhood of the neutral network A will be 
quickly explored. Valley genealogies that cross a 
wide valley are very rare. 

Therefore, on the shortest time scale, the popula- 
tion explores the neutral network A and its neigh- 
borhood, as shown in Fig. || If there are any higher- 
fitness genotypes in the immediate neighborhood of 
the network A, the population is most likely to es- 
cape from the metastable state (epoch) by cross- 
ing this entropy barrier. If the neutral network 
A is completely surrounded by valleys of lower fit- 
ness, the population will eventually escape from the 
metastable state when a mutant crosses one of the 
valleys that surrounds A, i.e., by crossing a fitness 
barrier. Since the waiting time for such a fitness bar- 
rier crossing increases very rapidly with the width of 
the barrier, it is most likely that the escape will oc- 
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cur along one of the narrowest valleys that connects 
A to higher fitness genotypes. Although shallow val- 
leys are more easily crossed than deep valleys, the 
effect of barrier height is relatively small compared 
to the effect of barrier width. 

This basic picture of the metastable population 
dynamics is captured more quantitatively by the 
scaling relations of Eqs. ( |33|) and (49) for the fitness- 
barrier and entropy-barrier crossing times, respec- 
tively. By equating these, we get a rough compari- 
son of the trade offs in the crossing of entropy ver- 
sus fitness barriers. To make the comparison for 
more general cases, we replace the factor 2 L in the 
entropy-barrier scaling form, Eq. (f49|), with V\ to 
denote the volume of the neutral network A. We also 
replace L by the average neutrality v — i.e., the av- 
erage number of neutral neighbors — of genotypes in 
A. This extends the entropy-barrier scaling form to 
more general neutral network topologies than simple 
binary hypercubes. We then obtain 



(64) 



This is an estimate of the volume Va of the network 
A that is explored in the time it takes the population 
to cross a fitness barrier of height a and width w. 

Recall that the factor wl denotes the number of 
different paths leading across the fitness barrier and 
that \ij log(cr) gives the average number of mutations 
valley lineages undergo before they go extinct. The 
relative size of \x and log(cr), together with the bar- 
rier width w, are the decisive quantities. Only when 
fj,/ log(cr) approaches 1 can fitness barriers be crossed 
relatively quickly. However, since fi is typically very 
small, even in cases where the highest-fitness geno- 
types have only a small fitness advantage over valley 
genotypes — i.e., a = 1 + e — the determining ratio 
fi/e may still be small. 

As an illustration, assume that each genotype in A 
has an average of v = 60 neutral neighbors, that the 
mutation rate is fx = 10~ 6 , and that genotypes in A 
are on average 1 percent more fit than genotypes in 
the valley. Substituting these values into Eq. (|64l) 
for a barrier of width w — 3, we find Va l(r . 
Thus, before this fitness barrier will be crossed, on 
the order of f 9 genotypes on the neutral network A 
will have been explored. Of course, these numbers 
should not be taken as quantitative predictions, they 
are only rough order-of-magnitude estimates. How- 
ever, they do indicate the difference in time scales 
at which entropy and fitness barriers are crossed. 



In cases where the fitness advantage e becomes 
so small as to be comparable in size to fi, the fit- 
ness barrier will have effectively turned into an en- 
tropy barrier. That is, for such small fitness advan- 
tages, selection is not able to stabilize the popula- 
tion within A and the population will freely diffuse 
through the valleys. In short, before e becomes so 
small as to be comparable to /i, the population will 
have already crossed the error threshold. This can 
be seen, for instance, by taking the logarithm on 
both sides in Eq. @. 

As we discussed earlier, the picture of the barrier 
crossing dynamics that has emerged from our anal- 
ysis contrasts strongly with the common view that 
this process is analogous to the dynamics of energy- 
barrier crossing in physical systems. In those sys- 
tems, the stochastic dynamics follows the local gra- 
dient of the energy landscape. This local character 
of the dynamics causes the barrier height to be the 
main determinant of the barrier crossing time. Gen- 
erally, the evolutionary population dynamics can- 
not be described as following a local gradient. The 
current highest fitness of the individuals located 
on the neutral network A sets an absolute fitness 
scale against which valley individuals must compete. 
Therefore, valley lineages are unlikely to survive for 
many generations and so cannot undergo many mu- 
tations before going extinct. This mechanism causes 
the barrier width to be the main determinant of the 
fitness-barrier crossing time in population dynamics. 

The preceding results demonstrate the important 
role of neutral networks in genotype space for evolu- 
tionary dynamics: where a population goes in geno- 
type space is largely determined by the available 
neutral paths. 
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APPENDIX A: ANALYTICAL 
APPROXIMATION OF THE EPOCH 
QUASISPECIES 



where |Aj| is the size of the set Aj. That is, we 
assume that a type j genotype is equally likely to 
be any of its possible \Aj\ genotypes. We then sum 
over all possible genotypes of type i to which j can 
mutate. The generation operator G is then, as 
before, the product of the selection operator, Eqs. 
(Al) and (A2), and the resulting mutation opera- 
tor: G = MS. 



We now list expressions for the different types of 
G's components. In order to simplify the formulas, 
wc first define 



In this appendix, we construct the generation op- 
erator G for the class of Royal Staircase with Ditches 
fitness functions, and analytically approximate the 
metastable quasispecies distributions P n . 

As before, the operator G decomposes into a part 
S that encodes the expected effects of selection on 
the population fitness distribution and a mutation 
operator M that encodes the expected effects of mu- 
tation. 

The selection operator S is easy to construct since 
selection depends only on the population's fitness 
distribution. After selection, we have for the non- 
ditch genotypes that: 



P. 



scl 



/ n _ p 



and for the ditch genotypes that 

h 



psel _ 

n,* 



1 n. 



(A2) 



Construction of the mutation operator M is more 
involved, but straightforwardly follows Ref. [ p0[ . 
Formally, the probability My that a genotype of 
type j turns into a genotype of type i under mu- 
tation is given by a sum over all genotypes of type i 
and j, weighted by the probability of mutating from 
one genotype to the other. When we refer to the 
"type" of a genotype we distinguish genotypes only 
based on their location in an epoch's neutral net- 
work (n) or intervening ditch (n, *). We denote by 
Aj the set of all genotypes of type i and by d(s, s') 
the Hamming distance between the genotypes s and 
s'. Formally, we then have: 



Mi 



E E 

seA; s'eA. 



d(s,s ) 
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L — d(s,s ) 



|A., 



(A3) 



(A4) 



and the probability A to not mutate a block, which 
is given by 



A=(l- M ) if 



(A5) 



Using this notation, the probability to mutate a 
block from type A to type B, for instance, becomes 
Pr(A — » B) = aX. Components Gy with i < j 
become 



Gij = aX^j 



(Al) for i ^ 1, and 



Gij = I 1 - A^ 1 - a 



X - A J " 
1 - A 
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(A6) 



(A7) 



for i = 1. The diagonal components are given by 

G n = A' . (A8) 

For the ditch genotypes, the diagonal components 
are: 



G 



U~h)U-h) 



Xi-^j-h) 



(A9) 



Components describing the transitions from ditch 
genotypes to lower-fitness nonditch genotypes i < j 
are given by: 



G 



iU-h) 



aX^U-h) . 



(A10) 



Finally, mutations from nonditch genotypes to lower 
lying ditch genotype come in three varieties. First, 
for i < j — 2 we have 
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G(i- h )j = a\ l l {\ - X - Xa)j . 

For i = j — 1 we have 

X<- 2 {1- \ ~ a\)j 



G 



(All) 



(A12) 



And for the ditch lying immediately below, i = j, 
we have 

= ^'~ 2 (1 - A - aA)j . (A13) 



The quasispecies fitness distribution during epoch 
n is given by the principal eigenvector of the matrix 
G restricted to the components with fitness lower 
than or equal to n; see Refs. pi] and |30|. This 



eigenvector can be obtained numerically, using the 
above formulas for the components Gy . 

An analytic approximation to P™ can be obtained 
by considering mutations only from each fitness j (or 
j — h) to equal or lower fitness. This approximation 
is justified by the fact that fitness-increasing muta- 
tions are very rare compared to fitness-decreasing 
mutations. Under that approximation, the matrix 
G becomes upper triangular. For upper triangular 
matrices, the components P™ of the principal eigen- 
vector P™ obey the equations: 



pn _ 



En f~i pr, 
3>i 



Gnn Gi 



(A14) 



and, since P n is a distribution, we additionally have 
the normalization condition given by 



(A15) 



Note that the sums in the above equations involve 
both ditch genotypes and nonditch genotypes. Their 
precise ordering with respect to fitness depends, of 
course, on the barrier height h. For instance, for 
< h < 1, the fitness n — h genotypes in the ditch 
between fitness-n and fitness- (n + 1) genotypes lie 
between fitness n — 1 and n. 

With the analytic expressions fo r the components 
of G in hand, and by using Eq. (A14), we can ex- 
press all P™ for i < n as a function of P". For in- 
stance, assuming h < 1, such that the genotypes of 
fitness n — h are the second highest- fitness genotypes 
in the population, we have 



pn 



n(l — A — aX) 
2 K hX 



pr, 



(A16) 



Using this, we have for genotypes of fitness n — 1 
that 



n(a(n-h)(l-X-aX) + 2 K X) n , 



(1 — n(l — A)) X2 K 



(A17) 



When all components P™ with i < n have been ex- 
pressed in terms of P™ in this w ay, on e finally uses 
the normalization condition Eq. ( A15) to determine 



pn 
n " 

This procedure leads in a straightforward way to 
a relatively accurate analytical expression for the 
epoch fitness distributions. The expressions, how- 
ever, are rather cumbersome and we will not list 
them all here. In general, they depend on the size 
of h since the ordering of the fitness values of the 
different types of genotypes depends on h. 
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